Video encoder and decoder using motion-based segmentation and merging

ABSTRACT

This invention relates to motion compensated (MC) coding of video and to an MC prediction scheme which allows fast and compact encoding of motion vector fields while retaining very low prediction error. By reducing the prediction error and the number of bits needed for representation of the motion vector field, substantial savings of bit rate are achieved. Reduction of the bit rate needed to represent the motion field is achieved by merging segments in video frames, by adaptation of the motion field model and by utilization of a motion field model based on orthogonal polynomials.

FIELD OF THE INVENTION

The present invention generally relates to video compression. More precisely, the invention relates to an encoder and method for performing motion compensated encoding of video data. The invention also relates to a decoder for decoding video data thus encoded.

BACKGROUND OF THE INVENTION

Motion compensated prediction is a key element of the majority of video coding schemes. FIG. 1 is a schematic diagram of an encoder for compression of video sequences using motion compensation. Essential elements in the encoder are a motion compensated prediction block 1, a motion estimator 2 and a motion field coder 3. The operating principle of the motion compensating video encoder is to compress the prediction error E_(n)(x,y), which is the difference between the incoming frame I_(n)(x,y) to be coded, called the current frame, and a prediction frame P_(n)(x,y), wherein:

E_(n)(x,y)=I_(n)(x,y)−P_(n)(x,y)  (1)

Compression of the prediction error E_(n)(x,y) typically introduces some loss of information. The compressed prediction error, denoted {tilde over (E)}_(n)(x,y), is sent to the decoder. The prediction frame P_(n)(x,y) is constructed by the motion compensated prediction block 1 and is built using pixel values of the previous, or some other already coded frame denoted R_(ref)(x,y), called a reference frame, and motion vectors describing estimated movements of pixels between the current frame and the reference frame. Motion vectors are calculated by the motion field estimator 2 and the resulting motion vector field is then coded in some way before applying it to the predictor block 1. The prediction frame is then:

P_(n)(x,y)=R_(ref) [x+Δx(x,y),y+Δy(x,y)]  (2)

The pair of numbers [Δx(x,y), Δy(x,y)] is called the motion vector of a pixel in location (x,y) in the current frame, whereas Δx(x,y) and Δy(x,y) are the values of horizontal and vertical displacement of this pixel. The set of motion vectors of all pixels in the current frame I_(n)(x,y) is called the motion vector field. The coded motion vector field is also transmitted as motion information to the decoder.
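Equations (1) and (2) can be illustrated with a minimal sketch. It assumes integer-valued motion vectors and clamps displaced coordinates to the frame borders; neither assumption is stated in the text, and the function names `predict_frame` and `prediction_error` are hypothetical.

```python
def predict_frame(ref, dx, dy):
    """Build P_n(x, y) = R_ref[x + dx(x, y), y + dy(x, y)] per equation (2)."""
    height, width = len(ref), len(ref[0])
    pred = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Border handling is an assumption: displaced coordinates are clamped.
            sx = min(max(x + dx[y][x], 0), width - 1)
            sy = min(max(y + dy[y][x], 0), height - 1)
            pred[y][x] = ref[sy][sx]
    return pred


def prediction_error(cur, pred):
    """E_n(x, y) = I_n(x, y) - P_n(x, y) per equation (1)."""
    return [[c - p for c, p in zip(c_row, p_row)]
            for c_row, p_row in zip(cur, pred)]


# Reference frame and a current frame whose content moved one pixel to the right.
ref = [[10, 20, 30, 40],
       [50, 60, 70, 80]]
cur = [[10, 10, 20, 30],
       [50, 50, 60, 70]]
dx = [[-1] * 4 for _ in range(2)]  # every pixel is fetched from one pixel to the left
dy = [[0] * 4 for _ in range(2)]

pred = predict_frame(ref, dx, dy)
err = prediction_error(cur, pred)
```

With a motion vector field that matches the true displacement, the prediction error frame in this toy example is zero everywhere, so only the motion information would carry the change.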

In the decoder shown in FIG. 2, pixels of the current frame I_(n)(x,y) are reconstructed by finding the pixels' predictions P_(n)(x,y) from the reference frame R_(ref)(x,y). The motion compensated prediction block 21 generates the prediction frame using the received motion information and the reference frame R_(ref)(x,y). In the prediction error decoder 22, a decoded prediction error frame {tilde over (E)}_(n)(x,y) is then added to the prediction frame, the result being the approximated current frame Ĩ_(n).

The general object of the motion compensated (MC) prediction encoder is to minimise the amount of information which needs to be transmitted to the decoder. It should minimise the amount of prediction error measured according to some criteria, e.g. the energy associated with E_(n)(x,y), and minimise the amount of information needed to represent the motion vector field.

The document N. Nguen, E. Dubois, “Representation of motion information for image coding”, Proc. Picture Coding Symposium '90, Cambridge, Mass., Mar. 26-28, 1990, pages 841-845, gives a review of motion field coding techniques.

As a rule of thumb, a reduction of prediction error requires a more sophisticated motion field, i.e. more bits must be spent on its encoding. Therefore the overall goal of the video encoding is to encode the motion vector field as compactly as possible while keeping the measure of prediction error as low as possible.

Due to the very large number of pixels in the frame, it is not efficient to transmit a separate motion vector for each pixel. Instead, in most of the video coding schemes the current frame is divided into image segments, as shown for example in FIG. 3, so that all motion vectors of the segment can be described by a few parameters. Image segments can be square blocks (for example, 16×16 pixel blocks are used in codecs in accordance with international standard ISO/IEC MPEG-1 or ITU-T H.261), or they can comprise completely arbitrarily shaped regions obtained for instance by a segmentation algorithm. In practice segments include at least a few tens of pixels.

The motion field estimation block 2 of FIG. 1 calculates motion vectors of all the pixels of a given segment which minimise some measure of prediction error in this segment, for example the square prediction error. Motion field estimation techniques differ both in the model of the motion field and in the algorithm for minimisation of the chosen measure of prediction error.

In order to compactly represent the motion vectors of the pixels in the segments it is desirable that their values are described by a function of few parameters. Such a function is called a motion vector field model. A known group of models are linear motion models, in which motion vectors are approximated by linear combinations of motion field basis functions. In such models the motion vectors of image segments are described by a general formula:

$$\Delta x(x,y) = \sum_{i=1}^{N} c_{i} f_{i}(x,y), \qquad \Delta y(x,y) = \sum_{i=N+1}^{N+M} c_{i} f_{i}(x,y) \qquad (3)$$

where parameters c_(i) are called motion coefficients and are transmitted to the decoder. Functions f_(i)(x,y) are called motion field basis functions and they have a fixed form known to both encoder and decoder.
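As an illustration of formula (3), the following sketch evaluates a motion vector from a coefficient list. The affine basis 1, x, y, the choice N = M = 3 and the reuse of the same three functions for both components are assumptions made for the example only; the text does not fix any particular basis.

```python
# Hypothetical affine basis used for both components: f = 1, x, y.
BASIS = [
    lambda x, y: 1.0,
    lambda x, y: float(x),
    lambda x, y: float(y),
]


def motion_vector(coeffs, x, y, n=3):
    """Evaluate formula (3): the first n coefficients weight the basis
    functions for the horizontal displacement, the remaining ones for
    the vertical displacement."""
    dx = sum(c * f(x, y) for c, f in zip(coeffs[:n], BASIS))
    dy = sum(c * f(x, y) for c, f in zip(coeffs[n:], BASIS))
    return dx, dy


# Six motion coefficients c_1..c_6 describing one segment.
coeffs = [1.0, 0.5, 0.0, -2.0, 0.0, 0.25]
vector = motion_vector(coeffs, 4, 8)
print(vector)  # (3.0, 0.0)
```

Only the six numbers in `coeffs` would have to be transmitted for the segment, regardless of how many pixels it contains.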

The problem when using the linear motion model having the above described formula is how to minimise in a computationally simple manner the number of motion coefficients c_(i) which are sent to the decoder, keeping at the same time some measure of distortion, e.g. a chosen measure of prediction error E_(n)(x,y), as low as possible.

The total amount of motion data which needs to be sent to the decoder depends both on the number of segments in the image and the number of motion coefficients per segment. Therefore, there exist at least two ways to reduce the total amount of motion data.

The first way is to reduce the number of segments by combining (merging) together those segments which can be predicted with a common motion vector field model without causing a large increase of prediction error. The number of segments in the frame can be reduced because very often adjacent, i.e. neighbouring, segments can be predicted well with the same set of motion coefficients. The process of combining such segments is called motion assisted merging. FIG. 3 shows a frame divided into segments. The prior art techniques for motion coefficient coding include several techniques for motion assisted merging. After motion vectors of all the segments have been estimated, motion assisted merging is performed. It is done by considering every pair of adjacent segments S_(i) and S_(j) with their respective motion coefficients c_(i) and c_(j). The area of combined segments S_(i) and S_(j) is denoted S_(ij). If the area S_(ij) can be predicted with one set of motion coefficients c_(ij) without causing excessive increase of prediction error over the error resulting from separate predictions of S_(i) and S_(j), then S_(i) and S_(j) are merged. The methods for motion assisted merging differ essentially in the way of finding a single set of motion coefficients c_(ij) which allows a good prediction of segments combined together.

One method is known as merging by exhaustive motion estimation. This method estimates “from scratch” a new set of motion parameters c_(ij) for every pair of adjacent segments S_(i) and S_(j). If the prediction error for S_(ij) is not excessively increased then the segments S_(i) and S_(j) are merged. Although this method can very well select the segments which can be merged, it is not feasible for implementation because it would increase the complexity of the encoder typically by several orders of magnitude.

Another method is known as merging by motion field extension. This method tests whether the area of S_(ij) can be predicted using either motion parameters c_(i) or c_(j) without an excessive increase of the prediction error. This method is characterised by very low computational complexity because it does not require any new motion estimation. However, it very often fails to merge segments because motion compensation with coefficients calculated for one segment very rarely predicts well also the adjacent segments.
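The motion field extension test can be sketched as follows. The sketch assumes a purely translational model (one integer vector per segment), a squared-error measure and clamping at the frame border; the names `sq_error`, `try_extension_merge` and `max_increase` are hypothetical.

```python
def sq_error(cur, ref, pixels, motion):
    """Squared prediction error over `pixels` for one translational vector."""
    dx, dy = motion
    height, width = len(ref), len(ref[0])
    total = 0
    for x, y in pixels:
        sx = min(max(x + dx, 0), width - 1)   # clamping is an assumption
        sy = min(max(y + dy, 0), height - 1)
        total += (cur[y][x] - ref[sy][sx]) ** 2
    return total


def try_extension_merge(cur, ref, seg_i, seg_j, c_i, c_j, max_increase):
    """Return the motion (c_i or c_j) that predicts the combined area S_ij
    within the allowed error increase, or None if neither does."""
    separate = sq_error(cur, ref, seg_i, c_i) + sq_error(cur, ref, seg_j, c_j)
    union = seg_i + seg_j
    best = min((c_i, c_j), key=lambda c: sq_error(cur, ref, union, c))
    increase = sq_error(cur, ref, union, best) - separate
    return best if increase <= max_increase else None


ref = [[0, 10, 20, 30, 30, 30, 30]]
cur = [[10, 20, 30, 30, 30, 30, 30]]       # content moved one pixel to the left
seg_i = [(0, 0), (1, 0), (2, 0)]           # textured area, estimated motion (1, 0)
seg_j = [(3, 0), (4, 0), (5, 0)]           # flat area, estimated motion (0, 0)
merged_motion = try_extension_merge(cur, ref, seg_i, seg_j, (1, 0), (0, 0), 0)
```

No new motion estimation is run here, which is why the method is cheap; it succeeds in this example only because the flat segment is predicted equally well by either vector.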

Still another method is known as merging by motion field fitting. In this method the motion coefficients c_(ij) are calculated by the method of approximation. This is done by evaluating a few motion vectors in each of the segments. Some motion vectors in segments S_(i) and S_(j) are depicted in FIG. 4. The motion field for the segment S_(ij) is determined by fitting a common motion vector field through these vectors using some known fitting method. The disadvantage of the method is that the motion field obtained by fitting is not precise enough and often leads to an unacceptable increase of prediction error.
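A fitting step of this kind could look like the sketch below. It assumes an affine model d(x, y) = c1 + c2·x + c3·y for one displacement component and an ordinary least-squares fit through the normal equations; the text only says "some known fitting method", so both choices and the helper names are illustrative.

```python
def solve(a, b):
    """Solve a small linear system a * x = b by Gaussian elimination
    with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_affine(samples):
    """Least-squares fit of d = c1 + c2*x + c3*y to samples (x, y, d)."""
    rows = [(1.0, float(x), float(y)) for x, y, _ in samples]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atb = [sum(r[i] * s[2] for r, s in zip(rows, samples)) for i in range(3)]
    return solve(ata, atb)


# A few horizontal displacements sampled in S_i and S_j (hypothetical values
# that happen to follow d = 1 + 0.5 * x exactly).
samples = [(0, 0, 1.0), (2, 0, 2.0), (0, 2, 1.0), (4, 2, 3.0), (6, 0, 4.0)]
c_ij = fit_affine(samples)
```

The fit minimises the distance to the sampled vectors rather than the prediction error itself, which is consistent with the disadvantage noted above.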

The second way to minimise the number of motion coefficients is to select for each segment a motion model which allows achieving satisfactorily low prediction error with as few coefficients as possible. Since the amount and the complexity of the motion vary between frames and between segments, it is not efficient to always use all N+M motion coefficients per segment. It is necessary to find out for every segment what is the minimum number of motion coefficients which yields a satisfactorily low prediction error. Such a process of adaptive selection of coefficients is called motion coefficient removal.

Methods for performing motion estimation with different models and selecting the most suitable one are proposed in H. Nicolas and C. Labit, “Region-based motion estimation using deterministic relaxation schemes for image sequence coding,” Proc. 1994 International Conference on Acoustics, Speech and Signal Processing, pp. III265-268 and P. Cicconi and H. Nicolas, “Efficient region-based motion estimation and symmetry oriented segmentation for image sequence coding,” IEEE Trans. on Circuits and Systems for Video Technology, Vol. 4, No. 3, June 1994, pp. 357-364. The methods try to adapt the motion model depending on the complexity of the motion by performing motion estimation with different models and selecting the most suitable one. The main disadvantage of these methods is their high computational complexity and the small number of different motion field models which can be tested in practice.

Although the afore-described methods reduce the amount of motion information sent to the decoder to some extent while maintaining the accuracy of the predicted image at a reasonable level, there is still a desire to further reduce that amount.

SUMMARY OF THE INVENTION

An object of the present invention is to create a motion compensated video encoder and a method of motion compensated encoding of video data and a video decoder for decoding motion compensation encoded video data, which allow reducing the amount of motion vector field data produced by some known motion estimator by a large factor without causing an unacceptable distortion of the motion vector field. The complexity of the motion field encoder should preferably be low for allowing practical implementation on available signal processors or general purpose microprocessors.

According to a first aspect of the invention, an encoder for performing motion compensated encoding of video data, comprising:

motion field estimating means, having an input for receiving a first video data frame I_(n) and a reference frame R_(ref), said motion field estimating means being arranged to estimate a motion vector field (ΔX, ΔY) describing scene motion displacements of video frame pixels, and having an output for outputting said first video frame, said motion vector field and said reference frame R_(ref);

motion field encoding means having an input to receive from said motion field estimating means said first estimated motion vector field; partitioning of a video frame into at least two segments, said segments being a first segment S_(i) and a second segment S_(j); said first video data frame and said reference frame R_(ref), said motion field encoding means being arranged to obtain compressed motion information comprising first motion coefficients representing said motion vector field;

motion compensated prediction means for predicting a predicted video data frame based on said reference frame R_(ref) and said compressed motion information;

computing means having an input for receiving said first video data frame and said predicted video data frame, said computing means being arranged to calculate a prediction error frame based on said predicted video data frame and on said first video data frame;

prediction error encoding means for encoding said prediction error frame;

means for transmitting said first motion coefficients and said prediction error frame to a decoder;

is characterised by said motion encoding means further comprising:

means for calculating and storing for each segment a distortion matrix E and a distortion vector y such that a predefined measure ΔE for distortion in each segment, due to approximating said motion vector field as coefficients c_(i) of a set of polynomial basis functions f_(i), is a function of (E c−y), c being a vector of said motion coefficients c_(i);

means for decomposing said distortion matrix E into a first matrix Q and a second matrix R such that

det Q≠0 and

Q R=E,

a subset of the set of all columns of matrix Q being a basis of a vector space defined by all possible linear combinations of all column vectors of matrix E, columns of matrix Q being orthogonal to each other;

means for calculating an auxiliary vector z according to z=Q⁻¹ y, Q⁻¹ being the inverse matrix of said first matrix Q;

means for generating for each segment a column extended matrix A comprising the columns of matrix R and vector z as an additional column, and for selecting all rows of matrix A which have elements unequal to zero in all columns due to R;

means for merging segments based on selective combination of segments producing an increase in said prediction error within a certain limit;

means for generating a row extended matrix B comprising said selected rows of matrix A of said first segment S_(i) and said selected rows of matrix A of said second segment S_(j);

means for performing a series of multiplications of rows of matrix B with scalars unequal to zero and additions of rows of matrix B in order to obtain a modified matrix B′ having in the columns due to matrix R as many rows as possible filled with zeros;

orthogonalising means receiving one of said matrices A, B and B′ as an input matrix M, said orthogonalising means being arranged to replace said polynomial basis functions f_(i) by orthogonal basis functions {tilde over (f)}_(i) and to calculate second motion coefficients {tilde over (c)} using said orthogonal basis functions and said input matrix M; and

quantisation means, having an input for receiving said second motion coefficients {tilde over (c)}, said quantisation means being arranged to quantise said second motion coefficients {tilde over (c)}.
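The decomposition and merging steps recited above can be illustrated with a small numerical sketch. It makes several assumptions that are not taken from the text: the distortion measure is the squared norm of (E c − y), E has full column rank, a thin QR factorisation by modified Gram-Schmidt stands in for the full decomposition (it yields directly the rows of A that have non-zero entries in the R columns), and matrix B is re-triangularised by running the same factorisation again rather than by explicit row operations. All function names are hypothetical.

```python
import math


def thin_qr(e):
    """Modified Gram-Schmidt: returns orthonormal columns q and upper
    triangular r with q * r = e (e must have full column rank)."""
    rows, n = len(e), len(e[0])
    q_cols, r = [], [[0.0] * n for _ in range(n)]
    for j in range(n):
        v = [e[i][j] for i in range(rows)]
        for k in range(j):
            r[k][j] = sum(q_cols[k][i] * v[i] for i in range(rows))
            v = [v[i] - r[k][j] * q_cols[k][i] for i in range(rows)]
        r[j][j] = math.sqrt(sum(x * x for x in v))
        q_cols.append([x / r[j][j] for x in v])
    return q_cols, r


def segment_matrix(e, y):
    """Selected rows of the column extended matrix A = [R | z], z = Q^T y."""
    q_cols, r = thin_qr(e)
    z = [sum(qc[i] * y[i] for i in range(len(y))) for qc in q_cols]
    return [r_row + [z_k] for r_row, z_k in zip(r, z)]


def merge(a_i, a_j):
    """Stack A_i over A_j into B and triangularise it again to obtain B'."""
    b = a_i + a_j
    n = len(b[0]) - 1
    return segment_matrix([row[:n] for row in b], [row[n] for row in b])


def solve_upper(a):
    """Back-substitution on [R | z] to recover the motion coefficients."""
    n = len(a)
    c = [0.0] * n
    for k in range(n - 1, -1, -1):
        tail = sum(a[k][j] * c[j] for j in range(k + 1, n))
        c[k] = (a[k][n] - tail) / a[k][k]
    return c


# Two segments whose data are both explained by c = (1, 2).
e_i, y_i = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]], [1.0, 3.0, 5.0]
e_j, y_j = [[1.0, 3.0], [1.0, 4.0], [1.0, 6.0]], [7.0, 9.0, 13.0]
a_i, a_j = segment_matrix(e_i, y_i), segment_matrix(e_j, y_j)
c_merged = solve_upper(merge(a_i, a_j))
```

The point of the construction is that merging needs only the small stored matrices A of each segment, not the original pixel data, so no new motion estimation is required for the combined segment.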

According to a second aspect of the invention, an encoder for performing motion compensated encoding of video data, comprising:

motion field estimating means, having an input for receiving a first video data frame I_(n) and a reference frame R_(ref), said motion field estimating means being arranged to estimate a motion vector field (ΔX, ΔY) describing scene motion displacements of video frame pixels and having an output to output said first video frame, said motion vector field and said reference frame R_(ref);

motion field encoding means having an input to receive from said motion field estimating means said first estimated motion vector field; partitioning of a video frame into at least two segments, said segments being a first segment S_(i) and a second segment S_(j); said first video data frame and said reference frame R_(ref), said motion field encoding means being arranged to obtain compressed motion information comprising first motion coefficients representing said motion vector field;

motion compensated prediction means for predicting a predicted video data frame based on said reference frame R_(ref) and said compressed motion information;

computing means having an input for receiving said first video data frame and said predicted video data frame, said computing means being arranged to calculate a prediction error frame based on said predicted video data frame and on said first video data frame;

prediction error encoding means for encoding said prediction error frame;

means for transmitting said first motion coefficients and said prediction error frame to a decoder;

is characterised by said motion encoding means further comprising:

means for calculating and storing for each segment a distortion matrix E and a distortion vector y such that a predefined measure ΔE for distortion in each segment, due to approximating said motion vector field as coefficients c_(i) of a set of polynomial basis functions f_(i), is a function of (E c−y), c being a vector of said motion coefficients c_(i);

means for decomposing said distortion matrix E into a first matrix Q and a second matrix R such that

det Q≠0, and

Q R=E,

a subset of the set of all columns of matrix Q being a basis of a vector space defined by all possible linear combinations of all column vectors of matrix E, columns of matrix Q being orthogonal to each other;

means for calculating an auxiliary vector z according to z=Q⁻¹ y, Q⁻¹ being the inverse matrix of said first matrix Q;

means for generating for each segment a column extended matrix A comprising the columns of matrix R and vector z as an additional column, and for selecting all rows of matrix A which have elements unequal to zero in all columns due to matrix R;

means for merging segments based on selective combination of segments producing an increase in said prediction error within a certain limit;

means for generating a row extended matrix B comprising said selected rows of matrix A of said first segment S_(i) and said selected rows of matrix A of said second segment S_(j);

means for performing a series of multiplications of rows of matrix B with scalars unequal to zero and additions of rows of matrix B in order to obtain a modified matrix B′ having in the columns due to matrix R as many rows as possible filled with zeros;

orthogonalising means receiving one of said matrices A, B and B′ as an input matrix M, said orthogonalising means being arranged to replace said polynomial basis functions f_(i) by orthogonal basis functions {tilde over (f)}_(i) and to modify said input matrix to a third matrix {tilde over (M)} corresponding to said orthogonal basis functions;

removing means having an input for receiving said third matrix {tilde over (M)}, said removing means being arranged to modify said third matrix by removing from said third matrix the i^(th) column due to R corresponding to the i^(th) basis function of said basis functions, and said removing means having an output to provide a fourth matrix {circumflex over (M)};

means for computing second motion coefficients {tilde over (c)} using said fourth matrix {circumflex over (M)}; and

quantisation means, having an input for receiving said second motion coefficients {tilde over (c)}, said quantisation means being arranged to quantise said second motion coefficients {tilde over (c)}.
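The column removal recited above can be illustrated for one simple case. The sketch assumes that the third matrix has the upper triangular form [R | z] and that the distortion is the squared error; under those assumptions, dropping the last column leaves a system whose first rows can still be satisfied exactly, so the squared error rises by the square of the last entry of z. Removing an arbitrary i^(th) column would in general require re-triangularisation, which is not shown. The names `remove_coefficients` and `max_increase` are hypothetical.

```python
def remove_coefficients(m, max_increase):
    """Drop trailing columns of the triangular system [R | z] while the
    accumulated rise in squared error stays within max_increase.

    Returns the reduced system and the accumulated error increase."""
    n = len(m)
    increase = 0.0
    while n > 1 and increase + m[n - 1][-1] ** 2 <= max_increase:
        # Row n becomes "0 = z_n" once column n is gone, adding z_n squared.
        increase += m[n - 1][-1] ** 2
        n -= 1
    # Keep the leading n x n block of R and the first n entries of z.
    reduced = [row[:n] + [row[-1]] for row in m[:n]]
    return reduced, increase


# Three coefficients; the third one contributes very little.
m_tilde = [
    [2.0, 1.0, 0.5, 4.0],
    [0.0, 1.0, 0.2, 1.0],
    [0.0, 0.0, 3.0, 0.03],
]
m_hat, added_error = remove_coefficients(m_tilde, max_increase=0.01)
```

In this example the third coefficient is removed and the second is kept, because removing it as well would add 1.0 to the squared error and exceed the limit.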

A method of motion compensated encoding of video data, according to a first aspect of the invention, comprising the steps:

a) receiving a first video data frame I_(n) and a reference frame, estimating a motion vector field (ΔX, ΔY) describing scene motion displacements of video frame pixels, and outputting said first video frame, said motion vector field and a reference frame R_(ref);

b) receiving said first estimated motion vector field; partitioning of a video frame into at least two segments, said segments being a first segment S_(i) and a second segment S_(j); said first video data frame and said reference frame R_(ref), for obtaining compressed motion information comprising first motion coefficients representing said motion vector field;

c) predicting a predicted video data frame based on said reference frame R_(ref) and said compressed motion information;

d) receiving said first video data frame to calculate a prediction error frame based on said predicted video data frame and on said first video data frame;

e) encoding said prediction error frame;

f) transmitting said first motion coefficients and said prediction error frame to a decoder;

is characterised by said method further comprising the steps:

g) calculating and storing for each segment a distortion matrix E and a distortion vector y such that a predefined measure ΔE for distortion in each segment, due to approximating said motion vector field as coefficients c_(i) of a set of polynomial basis functions f_(i), is a function of (E c−y), c being a vector of said motion coefficients c_(i);

h) decomposing said distortion matrix E into a first matrix Q and a second matrix R such that

det Q≠0 and

Q R=E,

a subset of the set of all columns of matrix Q being a basis of a vector space defined by all possible linear combinations of all column vectors of matrix E, columns of matrix Q being orthogonal to each other;

i) calculating an auxiliary vector z according to

z=Q⁻¹ y, Q⁻¹ being the inverse matrix of said first matrix Q;

j) generating for each segment a column extended matrix A comprising the columns of matrix R and vector z as an additional column, and selecting all rows of matrix A which have elements unequal to zero in all columns due to R;

k) merging segments based on selective combination of segments producing an increase in said prediction error within a certain limit;

l) generating a row extended matrix B comprising said selected rows of matrix A of said first segment S_(i) and said selected rows of matrix A of said second segment S_(j);

m) performing a series of multiplications of rows of matrix B with scalars unequal to zero and additions of rows of matrix B in order to obtain a modified matrix B′ having in the columns due to matrix R as many rows as possible filled with zeros;

o) receiving one of said matrices A, B and B′ as an input matrix M to replace said polynomial basis functions f_(i) by orthogonal basis functions {tilde over (f)}_(i) and to calculate second motion coefficients {tilde over (c)} using said orthogonal basis functions and said input matrix M; and

q) receiving and quantising said second motion coefficients {tilde over (c)}.

A method of motion compensated encoding of video data, according to a second aspect of the invention, comprising the steps:

a) receiving a first video data frame I_(n) and a reference frame, estimating a motion vector field (ΔX, ΔY) describing scene motion displacements of video frame pixels, and outputting said first video frame, said motion vector field and a reference frame R_(ref);

b) receiving said first estimated motion vector field; partitioning of a video frame into at least two segments, said segments being a first segment S_(i) and a second segment S_(j); said first video data frame and said reference frame R_(ref), for obtaining compressed motion information comprising first motion coefficients representing said motion vector field;

c) predicting a predicted video data frame based on said reference frame R_(ref) and said compressed motion information;

d) receiving said first video data frame to calculate a prediction error frame based on said predicted video data frame and on said first video data frame;

e) encoding said prediction error frame;

f) transmitting said first motion coefficients and said prediction error frame to a decoder;

is characterised by said method further comprising the steps:

g) calculating and storing for each segment a distortion matrix E and a distortion vector y such that a predefined measure ΔE for distortion in each segment, due to approximating said motion vector field as coefficients c_(i) of a set of polynomial basis functions f_(i), is a function of (E c−y), c being a vector of said motion coefficients c_(i);

h) decomposing said distortion matrix E into a first matrix Q and a second matrix R such that

det Q≠0 and

Q R=E,

a subset of the set of all columns of matrix Q being a basis of a vector space defined by all possible linear combinations of all column vectors of matrix E, columns of matrix Q being orthogonal to each other;

i) calculating an auxiliary vector z according to

z=Q⁻¹ y, Q⁻¹ being the inverse matrix of said first matrix Q;

j) generating for each segment a column extended matrix A comprising the columns of matrix R and vector z as an additional column, and selecting all rows of matrix A which have elements unequal to zero in all columns due to matrix R;

k) merging segments based on selective combination of segments producing an increase in said prediction error within a certain limit;

l) generating a row extended matrix B comprising said selected rows of matrix A of said first segment S_(i) and said selected rows of matrix A of said second segment S_(j);

m) performing a series of multiplications of rows of matrix B with scalars unequal to zero and additions of rows of matrix B in order to obtain a modified matrix B′ having in the columns due to matrix R as many rows as possible filled with zeros;

n) receiving one of said matrices A, B and B′ as an input matrix M, replacing said polynomial basis functions f_(i) by orthogonal basis functions {tilde over (f)}_(i) and modifying said input matrix to a third matrix {tilde over (M)} corresponding to said orthogonal basis functions;

o) receiving said third matrix {tilde over (M)} to modify said third matrix {tilde over (M)} to a fourth matrix {circumflex over (M)} by removing from said third matrix {tilde over (M)} the i^(th) column due to matrix R corresponding to the i^(th) basis function of said basis functions, and outputting said fourth matrix {circumflex over (M)};

p) computing second motion coefficients {tilde over (c)} using said fourth matrix {circumflex over (M)}; and

q) receiving said second motion coefficients {tilde over (c)} and quantising said second motion coefficients {tilde over (c)}.

A video decoder for decoding of motion compensation encoded video data, according to the invention, is characterised by said decoder comprising:

means for storing a video data frame;

means for predicting a video data frame based on said stored video data frame and on received motion information;

means for decoding received prediction error data and obtaining a prediction error frame; and

means for calculating and outputting an updated video data frame based on said predicted video data frame and said decoded prediction error frame, and storing the updated video data frame in said storing means; said means for predicting a video data frame comprising:

means for demultiplexing received motion data into at least two of the following: data concerning the partitioning of said updated video data frame into segments S_(i), data concerning a selection of basis functions from a set of motion field model basis functions, and data concerning coefficients of selected basis functions;

means for reconstructing said motion vector field in each segment S_(i) from a linear combination of said selected basis functions and coefficients; and

means for calculating said prediction frame based on said reconstructed motion vector field and based on said stored video data frame.
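The decoder-side reconstruction can be sketched as follows. The data layout (a list of per-segment dictionaries), the affine basis set, the rounding of displacements to whole pixels and the clamping at the frame border are all assumptions made for the illustration; the text does not specify a bitstream format or an interpolation scheme.

```python
# Hypothetical set of motion field model basis functions known to the decoder.
BASIS = [
    lambda x, y: 1.0,
    lambda x, y: float(x),
    lambda x, y: float(y),
]


def decode_frame(stored, segments, pred_error):
    """Reconstruct the motion vector field per segment from the selected
    basis functions and coefficients, predict each pixel from the stored
    frame and add the decoded prediction error."""
    height, width = len(stored), len(stored[0])
    out = [[0] * width for _ in range(height)]
    for seg in segments:
        for x, y in seg["pixels"]:
            dx = sum(c * BASIS[k](x, y)
                     for k, c in zip(seg["basis_x"], seg["coeff_x"]))
            dy = sum(c * BASIS[k](x, y)
                     for k, c in zip(seg["basis_y"], seg["coeff_y"]))
            # Whole-pixel rounding and border clamping are assumptions.
            sx = min(max(x + round(dx), 0), width - 1)
            sy = min(max(y + round(dy), 0), height - 1)
            out[y][x] = stored[sy][sx] + pred_error[y][x]
    return out


stored = [[10, 20, 30, 40]]
segments = [
    # Only the constant basis function is selected for each segment.
    {"pixels": [(0, 0), (1, 0)], "basis_x": [0], "coeff_x": [1.0],
     "basis_y": [], "coeff_y": []},
    {"pixels": [(2, 0), (3, 0)], "basis_x": [0], "coeff_x": [-1.0],
     "basis_y": [], "coeff_y": []},
]
pred_error = [[1, 0, 0, -2]]
updated = decode_frame(stored, segments, pred_error)
```

The updated frame would then replace the stored frame so that it can serve as the reference for the next received frame.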

A system for handling video data, comprising an encoder for performing motion compensated encoding of video data and a decoder for decoding said motion compensated encoding of video data, according to a first aspect of the invention, is characterised by said encoder comprising:

motion field estimating means, having an input for receiving a first video data frame I_(n) and a reference frame R_(ref), said motion field estimating means being arranged to estimate a motion vector field (ΔX, ΔY) describing scene motion displacements of video frame pixels, and having an output for outputting said first video frame, said motion vector field and said reference frame R_(ref);

motion field encoding means having an input to receive from said motion field estimating means said first estimated motion vector field; partitioning of a video frame into at least two segments, said segments being a first segment S_(i) and a second segment S_(j); said first video data frame and said reference frame R_(ref), said motion field encoding means being arranged to obtain compressed motion information comprising first motion coefficients representing said motion vector field;

motion compensated prediction means for predicting a predicted video data frame based on said reference frame R_(ref) and said compressed motion information;

computing means having an input for receiving said first video data frame and said predicted video data frame, said computing means being arranged to calculate a prediction error frame based on said predicted video data frame and on said first video data frame;

prediction error encoding means for encoding said prediction error frame;

means for transmitting said first motion coefficients and said prediction error frame to a decoder;

said motion encoding means further comprising:

means for calculating and storing for each segment a distortion matrix E and a distortion vector y such that a predefined measure ΔE for distortion in each segment, due to approximating said motion vector field as coefficients c_(i) of a set of polynomial basis functions f_(i), is a function of (E c−y), c being a vector of said motion coefficients c_(i);

means for decomposing said distortion matrix E into a first matrix Q and a second matrix R such that

det Q≠0 and

Q R=E,

a subset of the set of all columns of matrix Q being a basis of a vector space defined by all possible linear combinations of all column vectors of matrix E, columns of matrix Q being orthogonal to each other;

means for calculating an auxiliary vector z according to

z=Q⁻¹ y, Q⁻¹ being the inverse matrix of said first matrix Q;

means for generating for each segment a column extended matrix A comprising the columns of matrix R and vector z as an additional column, and for selecting all rows of matrix A which have elements unequal to zero in all columns due to R;

means for merging segments based on selective combination of segments producing an increase in said prediction error within a certain limit;

means for generating a row extended matrix B comprising said selected rows of matrix A of said first segment S_(i) and said selected rows of matrix A of said second segment S_(j);

means for performing a series of multiplications of rows of matrix B with scalars unequal to zero and additions of rows of matrix B in order to obtain a modified matrix B′ having in the columns due to matrix R as many rows as possible filled with zeros;

orthogonalising means receiving one of said matrices A, B and B′ as an input matrix M, said orthogonalising means being arranged to replace said polynomial basis functions f_(i) by orthogonal basis functions {tilde over (f)}_(i) and to calculate second motion coefficients {tilde over (c)} using said orthogonal basis functions and said input matrix M; and

quantisation means, having an input for receiving said second motion coefficients {tilde over (c)}, said quantisation means being arranged to quantise said second motion coefficients {tilde over (c)}; and said decoder comprising:

means for storing a video data frame;

means for predicting a video data frame based on said stored video data frame and on received motion information;

means for decoding received prediction error data and obtaining a prediction error frame; and

means for calculating and outputting an updated video data frame based on said predicted video data frame and said decoded prediction error frame, and storing the updated video data frame in said storing means; said means for predicting a video data frame further comprising:

means for demultiplexing received motion data into at least two of the following: data concerning the partitioning of said updated video data frame into segments S_(i), data concerning a selection of basis functions from a set of motion field model basis functions, and data concerning coefficients of selected basis functions;

means for reconstructing said motion vector field in each segment S_(i) from a linear combination of said selected basis functions and coefficients; and

means for calculating said prediction frame based on said reconstructed motion vector field and based on said stored video data frame.

A system for handling video data, comprising an encoder for performing motion compensated encoding of video data and a decoder for decoding said motion compensated encoding of video data, according to a second aspect of the invention, is characterised by said encoder comprising:

motion field estimating means, having an input for receiving a first video data frame I_(n) and a reference frame R_(ref), said motion field estimating means being arranged to estimate a motion vector field (ΔX, ΔY) describing scene motion displacements of video frame pixels and having an output to output said first video frame, said motion vector field and said reference frame R_(ref);

motion field encoding means having an input to receive from said motion field estimating means said first estimated motion vector field; partitioning of a video frame into at least two segments said segments being a first segment S_(i) and a second segment S_(j); said first video data frame and said reference frame R_(ref), said motion field encoding means being arranged to obtain compressed motion information comprising first motion coefficients representing said motion vector field;

motion compensated prediction means for predicting a predicted video data frame based on said reference frame R_(ref) and said compressed motion information;

computing means having an input for receiving said first video data frame and said predicted video data frame, said computing means being arranged to calculate a prediction error frame based on said predicted video data frame and on said first video data frame;

prediction error encoding means for encoding said prediction error frame;

means for transmitting said first motion coefficients and said prediction error frame to a decoder; said motion encoding means further comprising:

means for calculating and storing for each segment a distortion matrix E and a distortion vector y such that a predefined measure ΔE for distortion in each segment, due to approximating said motion vector field as coefficients c_(i) of a set of polynomial basis functions f_(i), is a function of (E c−y), c being a vector of said motion coefficients c_(i);

means for decomposing said distortion matrix E into a first matrix Q and a second matrix R such that

det Q≠0, and

Q R=E,

a subset of the set of all columns of matrix Q being a basis of a vector space defined by all possible linear combinations of all column vectors of matrix E, columns of matrix Q being orthogonal to each other;

means for calculating an auxiliary vector z according to z=Q⁻¹ y, Q⁻¹ being the inverse matrix of said first matrix Q;

means for generating for each segment a column extended matrix A comprising the columns of matrix R and vector z as an additional column, and for selecting all rows of matrix A which have elements unequal to zero in all columns due to matrix R;

means for merging segments based on selective combination of segments producing an increase in said prediction error within a certain limit;

means for generating a row extended matrix B comprising said selected rows of matrix A of said first segment S_(i) and said selected rows of matrix A of said second segment S_(j);

means for performing a series of multiplications of rows of matrix B with scalars unequal to zero and additions of rows of matrix B in order to obtain a modified matrix B′ having in the columns due to matrix R as many rows as possible filled with zeros;

orthogonalising means receiving one of said matrices A, B and B′ as an input matrix M, said orthogonalising means being arranged to replace said polynomial basis functions f_(i) by orthogonal basis functions {tilde over (f)}_(i) and to modify said input matrix to a third matrix {tilde over (M)} corresponding to said orthogonal basis functions;

removing means having an input for receiving said third matrix {tilde over (M)}, said removing means being arranged to modify said third matrix by removing from said third matrix the i^(th) column due to R corresponding to the i^(th) basis function of said basis functions, said removing means having an output to provide a fourth matrix {circumflex over (M)};

means for computing second motion coefficients {tilde over (c)} using said fourth matrix {circumflex over (M)}; and

quantisation means, having an input for receiving said second motion coefficients {tilde over (c)}, said quantisation means being arranged to quantise said second motion coefficients {tilde over (c)} and to output said second motion coefficients after quantisation; and said decoder comprising:

means for storing a video data frame;

means for predicting a video data frame based on said stored video data frame and on received motion information;

means for decoding received prediction error data and obtaining a prediction error frame; and

means for calculating and outputting an updated video data frame based on said predicted video data frame and said decoded prediction error frame, and storing the updated video data frame in said storing means; said means for predicting a video data frame comprising:

means for demultiplexing received motion data into at least two of the following: data concerning the partitioning of said updated video data frame into segments S_(i), data concerning a selection of basis functions from a set of motion field model basis functions, and data concerning coefficients of selected basis functions;

means for reconstructing said motion vector field in each segment S_(i) from a linear combination of said selected basis functions and coefficients; and

means for calculating said prediction frame based on said reconstructed motion vector field and based on said stored video data frame.

The embodiments of the invention are defined in the dependent claims.

In accordance with a first embodiment of the invention the encoder can perform merging of segments and find for a merged segment, in a computationally efficient way, a set of motion coefficients minimising a measure of distortion. This aspect of the invention also allows a simple and efficient evaluation of distortion due to segment merging, if desired. Advantageously, the encoder according to this aspect performs adaptive merging of adjacent segments of a video frame by means of calculating an additional distortion according to a predefined measure and merging the segments if the additional distortion is tolerable, e.g. below a given threshold or tolerable with regard to an achieved bit rate reduction. A chosen measure for calculating this distortion can be, but is not limited to, some measure of prediction error, for instance the energy or squared prediction error in the segment. Another measure of distortion can be e.g. the squared difference between an original frame and the restored original frame after encoding and decoding. For this purpose the motion field encoder included in the video encoder comprises three main blocks.

The first main block may be called a QR motion analyser. Its task is to find a new representation of the inputted motion vector field produced by the motion field estimator. This new representation is applied to the second main block. Operations in this first main block include a plurality of steps comprising matrix operations: In the first step the prediction frame is linearised using some known approximation method so that the prediction frame becomes a linear function of the motion vectors. In the second step a matrix E_(i) and a vector y_(i) are constructed for minimisation of an appropriate measure of prediction error, e.g. the square prediction error. Matrix E_(i) is decomposed into a product of two matrices Q_(i) and R_(i). In addition, an auxiliary vector z_(i) is calculated from the factor matrix Q_(i) and the vector y_(i). Part of the matrix R_(i) and the auxiliary vector z_(i) are applied to the second main block.

The second main block, called a segment merging block, performs merging operations for pairs of segments S_(i), S_(j). Advantageously, this block checks whether the motion in the combined area of S_(i) and S_(j) can be predicted using a common motion field model for the combined area. In the merging operations a matrix equation is firstly formed based on said factor matrices, thereafter the factor matrices are processed by using known matrix computation methods. The result is a matrix equation which allows calculation of motion coefficients common for the pair of segments under consideration, in a simple and efficient way. Using these coefficients the chosen measure of distortion can be calculated in the area of the merged segments. If the square prediction error is used as the measure of distortion, it can be easily calculated on the basis of the terms included in one of the resulting matrices. If the change of said measure of prediction error is acceptable according to a chosen criterion, the segments are merged. It will be appreciated that no merging occurs if merging of segments would cause excessive distortion to the prediction.

After all pairs of segments are considered the output of the segment merging block is a new segmentation of the image with a reduced number of segments. Moreover, for each new segment the block outputs a matrix R_(k)¹ and a vector z_(k)¹, which allow calculation of all motion coefficients in a simple and efficient way. Also, the encoder provides the decoder with information enabling the reconstruction of the resulting new segments in the frame.

The segment merging block according to this aspect of the invention allows a computationally simple judgement whether segments can be merged into one. Merging of as many segments as possible can be achieved by judging for each pair of adjacent segments whether merging is possible, and repeating this process for the resulting segmentation of the frame until no pair of adjacent segments suitable for merging remains in the frame.

According to an embodiment of the invention the amount of distortion introduced by the segment merging can be calculated based on a linear approximation of additional distortion due to approximating the motion vector field by a linear combination of basis functions.

The third main block is called an orthogonalisation block. This block receives as its input the partitioning of the current frame into segments and for every segment S_(k) the matrix R_(k)¹ and the vector z_(k)¹ from the segment merging block.

The block replaces the polynomial basis functions which describe the motion vectors of the image segments by orthogonal polynomials. Orthogonalisation leads to a motion field model which is less sensitive to quantisation errors resulting from quantising the motion coefficients and to representation of the motion vector field with fewer bits.

Additionally, an encoder according to a second aspect of the present invention includes a fourth main block, which allows removing coefficients from the set of coefficients representing the motion vector field of a segment and finding, in a computationally efficient way, the optimum remaining coefficients having regard to a measure of distortion. Also, if desired, the invention according to this aspect enables checking whether the omission of a particular coefficient of this set causes a significant increase of distortion in the motion field or not. This check is performed in a computationally efficient way such that for each segment each coefficient of the set may be subjected to such checking. It is only necessary to transmit those coefficients to the decoder that are found to increase the distortion significantly if omitted.

An encoder according to a third aspect of the invention comprises a first main block operating in a similar way to the first main block of the first aspect of the invention. It furthermore comprises a second main block equivalent to the fourth main block of the second aspect. The second main block receives for every segment S_(i) a matrix R_(i) and a vector z_(i) produced by the first block. If desired, the second main block determines based on matrix R_(i) and vector z_(i) for each of the segments whether it is possible to simplify the motion field model by removing basis functions from the model without intolerable increase of distortion.

The operations in the fourth main block of the third aspect of the invention are matrix operations, in which the matrix equation is firstly modified by removing one column and row of the matrix equation R_(i)c=z_(i), c being a vector comprising the coefficients c_(i) of the model. Removal of one column and row corresponds to removal of one basis function from the motion model. Then the matrix equation is triangularised. Motion coefficients corresponding to the reduced set of basis functions can be calculated by solving the resulting linear equation. The equation can be solved using back substitution or some other well known algorithm. If the prediction error is used as a measure of distortion, its change for the segment caused by removal of a basis function is a simple predetermined function of one term in the resulting equation.

For each segment more coefficients can be removed by repeating these matrix operations. Using such an approach one can easily find the distortion when using different reduced sets of basis functions. The set which yields the desired level of distortion is selected to represent the motion of the segment.

For every segment processed, the coefficient removal block outputs segment selection information which tells which basis functions were removed from the motion field model. Additionally, it outputs new motion coefficients corresponding to the remaining basis functions. Both selection information and motion coefficients are transmitted to the decoder.

The removal of a basis function from a set of basis functions is fully equivalent to setting the value of a coefficient corresponding to the removed basis function to zero. Thus, in an alternative implementation, the encoder may, instead of outputting the selection information followed by motion coefficients, output all motion coefficients with coefficients corresponding to removed basis functions having values equal to zero.

Preferably, a motion compensated video encoder takes advantage of adaptively merging segments, orthogonalising the output of segment merging and removing motion coefficients which are not significant for the overall distortion. Such a preferred encoder includes all four blocks previously described, namely a QR motion analyser, a segment merging block and an orthogonalisation block according to the first aspect of the invention and also a coefficient removal block according to the second aspect of the invention. The orthogonalisation block receives the current partitioning of segments and for every segment S_(k) it also receives the matrix R_(k)¹ and the vector z_(k)¹ from the segment merging block and orthogonalises the motion field model with respect to the segments obtained from segment merging. The coefficient removal block then receives for every segment S_(k) the matrix {tilde over (R)}_(k)¹ and the vector {tilde over (z)}_(k)¹ produced by the orthogonalisation block. Only after merging of adjacent segments and after manipulating matrix R¹ and vector z¹ for coefficient removal are the coefficients c_(i) of each segment calculated for transmission, resulting in a substantial reduction in the amount of motion data output by the video encoder.

Preferably, a video encoder and decoder according to the present invention is implemented in hardware, e.g. as one or more integrated circuits, adapted to perform encoding and compressing of received video frames, and decoding encoded video data, respectively, according to the present invention.

It is common in the video coding art that different areas of the video frame are coded using different coding modes. This variability of coding modes also includes the motion compensated prediction method used in the codecs. Several modes are used in all modern video codecs such as the ITU H.261 and H.263 as well as the ISO MPEG-1 and MPEG-2 video coding standards.

For example, some of the video frame areas are coded without using any temporal prediction at all (so called intra-blocks). No motion coefficients are transmitted for such image areas and the areas are coded without reference to any prior images. Practical embodiments of the present invention would also combine the invention with such intra-coding.

Furthermore, in typical video image sequences, large areas of the video frame remain stationary for the duration of several frames (i.e., there is no motion). It is computationally much easier to detect that an area has remained stationary than to estimate the motion of the video frame area. Therefore, practical video codecs often check the area to detect whether there is any motion at all and include a stationary prediction mode where no motion parameters need to be transmitted. Practical embodiments of the present invention would also combine the invention with such a stationary mode.

The MPEG video coding standards also include coding modes where the motion estimation is performed with respect to two reference frames (bi-directional prediction). This results in two different predictions for the area to be coded. The encoder may decide to use the better of these predictions, or it may decide to combine the two predictions (e.g., by averaging). The decision regarding the mode needs to be communicated to the decoder. The ITU H.263 standard also includes a temporal prediction mode using two reference frames. It is clear that the present invention can benefit from similar techniques of using multiple reference frames.

It is thus clear for those skilled in the art that the present invention can be the basis for one or more coding modes in a video codec where it is used together with prior art coding modes (such as intra coding, stationary modes, or multiple reference frames).

A preferred embodiment of the present invention will now be described with reference to the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a schematic diagram of a known encoder;

FIG. 2 is a schematic diagram of a known decoder;

FIG. 3 depicts adjacent segments for merging;

FIG. 4 illustrates merging by motion field approximation;

FIG. 5 is a motion field coder according to the preferred embodiment of the present invention;

FIG. 6 is a schematic diagram of a QR motion analyser;

FIG. 7 is a schematic diagram of a motion compensated prediction block according to an embodiment of the present invention; and

FIG. 8 is a block diagram of a decoder according to an embodiment of the present invention.

DETAILED DESCRIPTION

The output of the video encoder shown in FIG. 1 is the compressed frame divided into segments S_(i) and each of the segments S_(i) is accompanied by information regarding motion vectors [Δx(x,y), Δy(x,y)] for each pixel (x,y) of the respective segment. Then for a segment S_(i) which consists of P pixels with coordinates (x_(i),y_(i)), i=1,2, . . . , P, the task of the motion field encoder 3 is to find motion coefficients from the motion vector field (Δx(x_(i),y_(i)),Δy(x_(i),y_(i))) output by the motion field estimation block 2. The motion coefficients, denoted by c=(c₁, c₂, . . . , c_(N+M)), represent a compressed motion vector field [{tilde over (Δ)}x(·),{tilde over (Δ)}y(·)] which approximates [Δx(x,y), Δy(x,y)] as precisely as necessary using a linear motion model of the form:

$\begin{matrix}{\tilde{\Delta}x(x,y) = \sum\limits_{i=1}^{N} c_{i} f_{i}(x,y)} & \text{(4a)} \\ {\tilde{\Delta}y(x,y) = \sum\limits_{i=N+1}^{N+M} c_{i} f_{i}(x,y)} & \text{(4b)}\end{matrix}$

such that the square prediction error SPE is minimised, SPE being given by:

$\begin{matrix}{SPE = \sum\limits_{i=1}^{P} \left( I_{n}(x_{i},y_{i}) - R_{ref}\left( x_{i} + \tilde{\Delta}x(x_{i},y_{i}),\; y_{i} + \tilde{\Delta}y(x_{i},y_{i}) \right) \right)^{2}} & \text{(5)}\end{matrix}$

FIG. 5 illustrates an embodiment of a motion field encoder in a video encoder according to the invention. It corresponds to block 3 in FIG. 1 but its inputs also include the reference frame and the current frame. The third input to this block is the motion vector field [Δx(·),Δy(·)] produced by motion field estimation block 2, FIG. 1.

To fulfill said task, the motion field encoder 3 consists of four main building blocks which are the QR motion analyser block 31, the segment merging block 32, orthogonalisation block 32b and motion coefficient removal block 33. The segment merging block 32, orthogonalisation block 32b and the motion coefficient removal block 33 reduce the amount of motion information which may result in a less accurate prediction and hence an increase of the square prediction error.

The objective of the QR motion analyser is to find a new representation of the motion field that is suitable for judging efficiently the impact of segment merging, orthogonalisation and coefficient removal on the prediction error. This new representation is used later in the other three blocks for fast and flexible determination of motion coefficients for merged segments and for coefficient removal. FIG. 6 shows an embodiment of the QR motion analyser according to this invention. This block comprises a gradient filter 41 receiving a reference video frame input R_(ref). The outputs G_(x), G_(y) of the gradient filter are input into a block 42 for building a matrix E and into a block 45 for building a vector y. Matrix building block 42 performs a linearisation of the reference frame R_(ref) such that the approximated reference frame is a linear function of {tilde over (Δ)}x and {tilde over (Δ)}y and calculates on the basis of this linearisation a matrix E, the multiplication of which with a vector c of coefficients c_(i) in equations 4a, 4b above may be interpreted as prediction error resulting if Δx, Δy are replaced by a linear combination of basis functions f_(i)(x,y) of a linear motion model.

Block 45 for building vector y receives the current frame I_(n), reference frame R_(ref), the outputs G_(x), G_(y) of the gradient filter 41 and the motion vectors [Δx(x,y),Δy(x,y)] estimated by block 2 in FIG. 1 and calculates said vector y from these inputs.

Matrix E and vector y are received by a QR factoriser block 43 and a matrix multiplier block 46, respectively. The function of these blocks can be regarded as a coordinate transformation of matrix E and vector y in order to prepare for finding coefficients c_(i) such that for all pixels of a given segment the prediction error resulting from the representation of Δx, Δy as a linear combination of basis functions f_(i) is as close as possible to the inherent prediction error. This will be explained in further detail below.

Block 43 outputs a matrix R which results from representing matrix E in the coordinates of a matrix Q also output by block 43. Block 46 receives not only said vector y but also said matrix Q from block 43 and finds a vector z representing y in the coordinates of matrix Q. Preferably matrix Q is orthonormal. As will be shown in further detail below this representation of E and y as R and z, respectively, is very advantageous for judging whether adjacent segments can be merged with tolerable increase of prediction error, and also for finding the minimum number of coefficients necessary for representing the motion vector field of a merged or non-merged segment, i.e. for removing non-significant coefficients from the set of coefficients c_(i) in equations 4a, 4b.

Blocks 44 and 47 receive matrix R and vector z, respectively, and select rows from these which are required for judging the effect of segment merging and/or motion coefficient removal. These operations are performed based on R and z without the need of calculating said coefficients c_(i). Furthermore, all row manipulations refer both to the rows of R and to the corresponding rows of z such that R and z can be regarded for the purpose of segment merging and/or motion coefficient removal as a single column extended matrix A comprising the columns of R and comprising as an additional column, vector z. Accordingly, blocks 44 and 47 can be regarded and implemented as one block for manipulating matrix A by selecting appropriate rows of A and for outputting a modified matrix A′ comprising the selected rows of A. A′ comprises the selected rows denoted R¹ of R and the selected rows denoted z¹ of z.

Segment merging block 32 receives R¹ and z¹, i.e. matrix A′, for each segment and judges whether merging of two segments S_(i), S_(j) by means of representing the motion vector fields of both segments with the same set of coefficients, results in a tolerable increase of prediction error. This is done by means of generating a row extended matrix B comprising all rows of matrix A′_(i) of segment S_(i) and of matrix A′_(j) of segment S_(j). Segments S_(i), S_(j) can be, but do not have to be, adjacent. Matrix B is subjected to a further coordinate transformation e.g. by means of triangularisation of matrix B, resulting in a modified matrix B′. Block 32 in FIG. 5 judges whether segment merging is possible, from selected elements in matrix B′ in that column which results from vectors z_(i)¹ and z_(j)¹ and in rows which have zeros in the columns of B′ resulting from matrices R_(i)¹ and R_(j)¹. Preferably, said further coordinate transformation is orthonormal. Then the additional prediction error resulting from merging is the sum of the squares of said selected elements.
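The merging test just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation of block 32: the function names `triangularise` and `merge_cost`, the choice of Givens rotations as the orthonormal transformation, the numerical values of A′ and the tolerance are all hypothetical. Each segment contributes its matrix A′ = [R¹ | z¹]; the rows are stacked into B, B is triangularised into B′, and the additional squared prediction error is read from the z column of the rows whose R part has become zero.

```python
import math

def triangularise(rows):
    """Orthonormally reduce an augmented matrix [R | z] to upper-triangular form.

    Givens rotations are used as one possible orthonormal transformation
    (an assumption; the text only asks for some triangularisation).
    """
    a = [[float(v) for v in row] for row in rows]
    n_coeff = len(a[0]) - 1                 # the last column holds z
    for col in range(n_coeff):
        for row in range(len(a) - 1, col, -1):
            x, y = a[col][col], a[row][col]
            if y == 0.0:
                continue                    # entry is already zero
            h = math.hypot(x, y)
            c, s = x / h, y / h
            for k in range(n_coeff + 1):
                p, q = a[col][k], a[row][k]
                a[col][k] = c * p + s * q
                a[row][k] = -s * p + c * q
    return a

def merge_cost(a_i, a_j):
    """Return (common coefficients, additional squared prediction error)."""
    n = len(a_i[0]) - 1
    b_prime = triangularise(a_i + a_j)      # B is A'_i stacked on A'_j
    # Rows n and below have zeros in the R columns; their z entries give the error.
    delta_e = sum(row[n] ** 2 for row in b_prime[n:])
    # Back substitution on the first n rows yields the common coefficients.
    c = [0.0] * n
    for r in range(n - 1, -1, -1):
        acc = b_prime[r][n] - sum(b_prime[r][k] * c[k] for k in range(r + 1, n))
        c[r] = acc / b_prime[r][r]
    return c, delta_e

# Hypothetical A' = [R1 | z1] for two segments with two coefficients each.
A_i = [[1.0, 0.0, 1.0],
       [0.0, 1.0, 2.0]]
A_j = [[1.0, 2.0, 5.0],
       [0.0, 1.0, 3.0]]

c_ij, delta_e = merge_cost(A_i, A_j)
THRESHOLD = 1.0                             # hypothetical tolerance
print(c_ij, delta_e, delta_e <= THRESHOLD)
```

Because the rotations are orthonormal, the sum of squares in the leftover rows equals the increase in the linearised squared prediction error, so the decision needs neither the pixel data nor the individual coefficients of the two segments.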

Orthogonalisation block 32b receives for each segment after frame resegmentation said matrix A′ if the corresponding segment remained unmerged, or matrix B′ for merged segments and merging information from segment merging block 32. Block 32b then modifies matrices A′ or B′ by replacing the polynomial basis functions which represent the motion vectors of such a segment with orthogonal polynomials. The modified matrices, together with the segmentation information, are output to block 33. The modified matrices are denoted by Ã¹ and {tilde over (B)}¹, respectively, depending on whether they originate from unmerged or merged segments.

For each segment the motion coefficient removal block 33 in FIG. 5 receives said matrix Ã¹ if the corresponding segment remained unmerged, or matrix {tilde over (B)}¹ for merged segments and judges whether removal of coefficients c_(i) is possible with a tolerable increase of prediction error. This is performed by block 33 by means of extracting a row from matrix Ã¹ or {tilde over (B)}¹, respectively, i.e. the row corresponding to coefficient c_(i). The additional prediction error introduced due to removing a coefficient can then be calculated from a selected element of said transformed matrix, said selected element being located in the column resulting from z¹ of said transformed matrix and in the row of this matrix which has zeros in all columns resulting from R¹.
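As a worked illustration of this error evaluation, consider a model with only two coefficients from which the first is removed. The two-coefficient size, the removal of the column of R¹ belonging to the dropped coefficient, and the use of a single orthonormal row rotation are assumptions made for the sketch; the symbols h and q are introduced here only for the example.

```latex
\begin{aligned}
\begin{bmatrix} r_{11} & r_{12} \\ 0 & r_{22} \end{bmatrix}
\begin{bmatrix} c_1 \\ c_2 \end{bmatrix}
= \begin{bmatrix} z_1 \\ z_2 \end{bmatrix}
\;&\xrightarrow{\ \text{remove column 1}\ }\;
\begin{bmatrix} r_{12} \\ r_{22} \end{bmatrix} c_2
= \begin{bmatrix} z_1 \\ z_2 \end{bmatrix} \\
&\xrightarrow{\ \text{rotate rows}\ }\;
\begin{bmatrix} h \\ 0 \end{bmatrix} c_2
= \begin{bmatrix} (r_{12} z_1 + r_{22} z_2)/h \\ q \end{bmatrix},
\qquad h = \sqrt{r_{12}^2 + r_{22}^2},
\quad q = \frac{r_{12} z_2 - r_{22} z_1}{h}
\end{aligned}
```

No choice of c₂ can satisfy the second row, so under these assumptions removing c₁ increases the squared prediction error by q², and the remaining coefficient follows from the first row as c₂ = (r₁₂z₁ + r₂₂z₂)/h².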

Multiplexer 34 in FIG. 5 receives merging information from block 32, information about which coefficients c_(i) are selected for transmission, and the selected coefficients c_(i) which are finally calculated based on said transformed matrix produced by block 33. The information transmitted by multiplexer 34 is then output to the video decoder (not shown).

In more detail, the operation of the QR motion analyser consists of the following steps:

Step 1 is linearisation of the prediction error. In this step the reference frame R_(ref) in equation (5) is approximated using some known approximation method so that it becomes linear with respect to [{tilde over (Δ)}x(·),{tilde over (Δ)}y(·)]. Then the elements under the sum in formula (5) become linear combinations of coefficients c_(i):

$\begin{matrix}{SPE = \sum\limits_{j=1}^{P} \left( e_{j,1}c_{1} + e_{j,2}c_{2} + \ldots + e_{j,N+M}c_{N+M} - y_{j} \right)^{2}} & \text{(6)}\end{matrix}$

In the preferred implementation a quadratic polynomial motion vector field model with 12 coefficients is used:

{tilde over (Δ)}x(x,y)=c₁+c₂x+c₃y+c₄xy+c₅x²+c₆y²  (7a)

{tilde over (Δ)}y(x,y)=c₇+c₈x+c₉y+c₁₀xy+c₁₁x²+c₁₂y²  (7b)

In practice this model can handle even very complex motion in video sequences very well and yields good prediction results.
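To make the 12-coefficient model concrete, the following minimal Python sketch evaluates equations (7a) and (7b) at one pixel. The function names `quadratic_basis` and `motion_vector` and the sample coefficients are hypothetical and chosen only for illustration.

```python
def quadratic_basis(x, y):
    """Basis functions 1, x, y, xy, x^2, y^2 shared by both vector components."""
    return [1.0, x, y, x * y, x * x, y * y]

def motion_vector(c, x, y):
    """Evaluate equations (7a) and (7b) for a 12-element coefficient list c."""
    if len(c) != 12:
        raise ValueError("expected 12 motion coefficients")
    f = quadratic_basis(x, y)
    dx = sum(ci * fi for ci, fi in zip(c[:6], f))   # c1..c6  -> horizontal part
    dy = sum(ci * fi for ci, fi in zip(c[6:], f))   # c7..c12 -> vertical part
    return dx, dy

# Hypothetical coefficients: a translation by (1.5, -0.5) plus a slight zoom.
coeffs = [1.5, 0.01, 0.0, 0.0, 0.0, 0.0,
          -0.5, 0.0, 0.01, 0.0, 0.0, 0.0]
print(motion_vector(coeffs, 10.0, 20.0))
```

Setting a coefficient to zero in `coeffs` corresponds to removing the matching basis function from the model, which is the equivalence the coefficient removal block relies on.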

In the QR motion analyser block, linearisation in step 1 is done by using Taylor expansion of R_(ref) at every pixel (x_(i),y_(i)) where i=1,2, . . . P, around points:

x′_(i)=x_(i)+Δx(x_(i),y_(i))

y′_(i)=y_(i)+Δy(x_(i),y_(i))

Using the property that Σa²=Σ(−a)², the square prediction error SPE is then

$SPE = \sum\limits_{i=1}^{P} \left( R_{ref}(x_{i}^{\prime},y_{i}^{\prime}) + \left( \tilde{\Delta}x(x_{i},y_{i}) - \Delta x(x_{i},y_{i}) \right) G_{x}(x_{i}^{\prime},y_{i}^{\prime}) + \left( \tilde{\Delta}y(x_{i},y_{i}) - \Delta y(x_{i},y_{i}) \right) G_{y}(x_{i}^{\prime},y_{i}^{\prime}) - I_{n}(x_{i},y_{i}) \right)^{2}$

Auxiliary values g_(j)(x,y) are calculated using the formula:

$g_{j}(x_{i},y_{i}) = \begin{cases} f_{j}(x_{i},y_{i})\, G_{x}(x_{i}^{\prime},y_{i}^{\prime}) & \text{when } j = 1,2,\ldots,N \\ f_{j}(x_{i},y_{i})\, G_{y}(x_{i}^{\prime},y_{i}^{\prime}) & \text{when } j = N+1,N+2,\ldots,N+M \end{cases}$

where function f_(j)(x_(i),y_(i)) is a predefined basis function according to the motion field model as defined in equations (4a) and (4b) and more specifically, in equations (7a) and (7b).

Step 2 is construction of matrices. It is based on the fact that minimisation of the SPE according to formula (6) is fully equivalent to minimisation of the matrix expression (Ec−y)^(T)(Ec−y), where E and y are:

$\begin{matrix}{E = \begin{bmatrix} e_{1,1} & e_{1,2} & \ldots & e_{1,N+M} \\ e_{2,1} & e_{2,2} & \ldots & e_{2,N+M} \\ \vdots & \vdots & \ddots & \vdots \\ e_{P,1} & e_{P,2} & \ldots & e_{P,N+M} \end{bmatrix}, \quad y = \begin{bmatrix} y_{1} \\ y_{2} \\ \vdots \\ y_{P} \end{bmatrix}} & \text{(8)}\end{matrix}$

Matrix E and vector y in equation (8) are built using formulae:

$E = \begin{bmatrix} g_{1}(x_{1},y_{1}) & g_{2}(x_{1},y_{1}) & \ldots & g_{N+M}(x_{1},y_{1}) \\ g_{1}(x_{2},y_{2}) & g_{2}(x_{2},y_{2}) & \ldots & g_{N+M}(x_{2},y_{2}) \\ \vdots & \vdots & \ddots & \vdots \\ g_{1}(x_{P},y_{P}) & g_{2}(x_{P},y_{P}) & \ldots & g_{N+M}(x_{P},y_{P}) \end{bmatrix},$

$y = \begin{bmatrix} I_{n}(x_{1},y_{1}) - R_{ref}(x_{1}^{\prime},y_{1}^{\prime}) + G_{x}(x_{1}^{\prime},y_{1}^{\prime})\,\Delta x(x_{1},y_{1}) + G_{y}(x_{1}^{\prime},y_{1}^{\prime})\,\Delta y(x_{1},y_{1}) \\ I_{n}(x_{2},y_{2}) - R_{ref}(x_{2}^{\prime},y_{2}^{\prime}) + G_{x}(x_{2}^{\prime},y_{2}^{\prime})\,\Delta x(x_{2},y_{2}) + G_{y}(x_{2}^{\prime},y_{2}^{\prime})\,\Delta y(x_{2},y_{2}) \\ \vdots \\ I_{n}(x_{P},y_{P}) - R_{ref}(x_{P}^{\prime},y_{P}^{\prime}) + G_{x}(x_{P}^{\prime},y_{P}^{\prime})\,\Delta x(x_{P},y_{P}) + G_{y}(x_{P}^{\prime},y_{P}^{\prime})\,\Delta y(x_{P},y_{P}) \end{bmatrix}$
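The link between the linearised prediction error and the entries of E and y can be written out explicitly. The derivation below is a reconstruction from equations (4a), (4b), (6) and (8) and from the definition of the auxiliary values g_(j); it introduces no new quantities, and function arguments are abbreviated where they are unambiguous.

```latex
\begin{aligned}
&R_{ref}(x'_i,y'_i)
 + \bigl(\tilde{\Delta}x - \Delta x\bigr)\,G_x(x'_i,y'_i)
 + \bigl(\tilde{\Delta}y - \Delta y\bigr)\,G_y(x'_i,y'_i)
 - I_n(x_i,y_i) \\
&\quad= \sum_{k=1}^{N} c_k f_k(x_i,y_i)\,G_x(x'_i,y'_i)
 + \sum_{k=N+1}^{N+M} c_k f_k(x_i,y_i)\,G_y(x'_i,y'_i) \\
&\qquad - \Bigl[\, I_n(x_i,y_i) - R_{ref}(x'_i,y'_i)
 + G_x(x'_i,y'_i)\,\Delta x(x_i,y_i)
 + G_y(x'_i,y'_i)\,\Delta y(x_i,y_i) \Bigr] \\
&\quad= \sum_{k=1}^{N+M} g_k(x_i,y_i)\,c_k - y_i
 \;=\; (Ec - y)_i
\end{aligned}
```

Summing the squares of these terms over i = 1, …, P gives SPE = (Ec−y)^(T)(Ec−y), so that e_(i,k) = g_(k)(x_(i),y_(i)) and y_(i) is the bracketed expression.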

G_(x)(x,y) and G_(y)(x,y) are values of the horizontal and vertical gradients of the reference frame R_(ref)(x,y), calculated using the following formulae:

G_(x)(x,y)=R_(ref)(x+1, y)−R_(ref)(x−1, y),

G_(y)(x,y)=R_(ref)(x, y+1)−R_(ref)(x, y−1)

The pixel values of R_(ref)(x,y), G_(x)(x,y) and G_(y)(x,y) are defined only for integer coordinates x and y. When x or y is non-integer, the pixel value is calculated e.g. using bilinear interpolation of the closest pixels with integer coordinates.
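By way of illustration only, the gradient formulae and the bilinear interpolation mentioned above might be sketched as follows. The function names `gradients` and `bilinear`, the storage of a frame as a list of rows indexed `frame[y][x]`, and the clamping of coordinates at the frame border are assumptions made for this sketch and are not taken from the description.

```python
import math

def gradients(ref, x, y):
    """Central differences G_x, G_y of the reference frame at integer (x, y).

    Border handling by clamping is an assumption of this sketch.
    """
    h, w = len(ref), len(ref[0])
    def clamp(v, hi):
        return min(max(v, 0), hi)
    gx = ref[y][clamp(x + 1, w - 1)] - ref[y][clamp(x - 1, w - 1)]
    gy = ref[clamp(y + 1, h - 1)][x] - ref[clamp(y - 1, h - 1)][x]
    return gx, gy

def bilinear(frame, x, y):
    """Sample frame[y][x] at a possibly non-integer position (x, y)."""
    h, w = len(frame), len(frame[0])
    # Clamp the position to the frame (assumption of this sketch).
    x = min(max(float(x), 0.0), w - 1.0)
    y = min(max(float(y), 0.0), h - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    ax, ay = x - x0, y - y0
    # Interpolate along x on the two neighbouring rows, then along y.
    top = (1.0 - ax) * frame[y0][x0] + ax * frame[y0][x1]
    bottom = (1.0 - ax) * frame[y1][x0] + ax * frame[y1][x1]
    return (1.0 - ay) * top + ay * bottom
```

The same `bilinear` helper could be applied to the gradient images when x′ or y′ is non-integer.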

Step 3 is QR factorisation. QR factorisation of a matrix is as such well known and a suitable algorithm is described in G. H. Golub and C. F. Van Loan, “Matrix Computations”, 2nd edition, The Johns Hopkins University Press, 1989. This algorithm can be used to decompose matrix E into a product of two matrices

E=Q R  (9)

In other words, R is a representation of E in coordinates of Q. Q is preferably orthonormal and such that R is upper triangular, i.e. rows N+M+1 to P of R are all zero.

In this step an auxiliary vector z is also calculated where

 z=Q^(T)y  (10)

In step 4 the output of the QR motion analyser block is calculated. The output comprises a matrix R¹ consisting of the first N+M rows of matrix R and a vector z¹ consisting of the first N+M elements of z.
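Steps 3 and 4 might be sketched as follows. This is a minimal illustration that uses Givens rotations for the factorisation, which is only one of the possible QR algorithms; the function name `qr_analyse` is hypothetical, and the sketch assumes that E has at least as many rows as columns (P ≥ N+M). Rotating the rows of the augmented matrix [E | y] applies Q^(T) to E and to y at the same time, so Q never needs to be stored.

```python
import math

def qr_analyse(E, y):
    """Return (R1, z1): the first n rows of R and the first n elements of
    z = Q^T y, where E = Q R and n is the number of columns of E."""
    P, n = len(E), len(E[0])
    # Augmented matrix [E | y]; each rotation acts on E and y together.
    A = [list(map(float, row)) + [float(yi)] for row, yi in zip(E, y)]
    for col in range(n):
        for row in range(col + 1, P):
            a, b = A[col][col], A[row][col]
            if b == 0.0:
                continue  # entry below the diagonal is already zero
            r = math.hypot(a, b)
            c, s = a / r, b / r
            for k in range(col, n + 1):
                u, v = A[col][k], A[row][k]
                A[col][k] = c * u + s * v
                A[row][k] = -s * u + c * v
    R1 = [A[i][:n] for i in range(n)]
    z1 = [A[i][n] for i in range(n)]
    return R1, z1
```

Only R¹ and z¹ are kept per segment, so the storage per segment does not depend on the number of pixels P.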

In the segment merging block the merging operation is performed for pairs of adjacent segments S_(i) and S_(j), see FIG. 4, by judging whether for a combined segment S_(ij) the motion vector field can be represented using a common motion field described by motion coefficient vector c_(ij). The merging operation consists of the following steps:

Step 1 comprises matrix calculation. This invention utilises a previously unknown property that the motion coefficient vector c_(ij) minimising the prediction error in the merged segment S_(ij) also minimises the scalar value $\begin{matrix}{\left( \begin{bmatrix} R_{i}^{1} \\ R_{j}^{1} \end{bmatrix} c_{ij} - \begin{bmatrix} z_{i}^{1} \\ z_{j}^{1} \end{bmatrix} \right)^{T} \left( \begin{bmatrix} R_{i}^{1} \\ R_{j}^{1} \end{bmatrix} c_{ij} - \begin{bmatrix} z_{i}^{1} \\ z_{j}^{1} \end{bmatrix} \right)} & (11)\end{matrix}$

where R_(i) ¹, z_(i) ¹ and R_(j) ¹, z_(j) ¹ are already produced by the QR analyser block for segments S_(i) and S_(j), respectively, as described above. This minimisation of (11) is equivalent to solving in the least squares sense the overdetermined system of equations $\begin{matrix}{\begin{bmatrix} R_{i}^{1} \\ R_{j}^{1} \end{bmatrix} c_{ij} = \begin{bmatrix} z_{i}^{1} \\ z_{j}^{1} \end{bmatrix}} & (12)\end{matrix}$
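The property can be checked directly for the preferred case in which Q is orthonormal. The following is a short sketch rather than a formal proof; the symbol z_(i) ² is introduced here only to denote the elements of z_(i) beyond the first N+M. Because multiplication by an orthonormal Q_(i) ^(T) does not change the Euclidean norm, and because rows N+M+1 to P of R_(i) are zero, the linearised error of segment S_(i) for an arbitrary coefficient vector c is

$\| E_{i} c - y_{i} \|^{2} = \| Q_{i}^{T} ( E_{i} c - y_{i} ) \|^{2} = \| R_{i}^{1} c - z_{i}^{1} \|^{2} + \| z_{i}^{2} \|^{2}$

Adding the corresponding expression for segment S_(j) gives the error of the combined segment:

$\| E_{i} c - y_{i} \|^{2} + \| E_{j} c - y_{j} \|^{2} = \left\| \begin{bmatrix} R_{i}^{1} \\ R_{j}^{1} \end{bmatrix} c - \begin{bmatrix} z_{i}^{1} \\ z_{j}^{1} \end{bmatrix} \right\|^{2} + \| z_{i}^{2} \|^{2} + \| z_{j}^{2} \|^{2}$

The last two terms do not depend on c, so the vector c minimising the left-hand side is the one minimising expression (11).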

Step 2 comprises triangularisation of the matrices obtained in step 1. If QR factorisation of E for segment S_(i), that is E_(i), and of E for segment S_(j), that is E_(j), is applied according to the teaching of the afore-mentioned document, matrices R_(i) ¹, R_(j) ¹ are upper triangular and the system $\begin{bmatrix} R_{i}^{1} \\ R_{j}^{1} \end{bmatrix} c_{ij} = \begin{bmatrix} z_{i}^{1} \\ z_{j}^{1} \end{bmatrix}$

in (12) has the form: $\begin{matrix}{\begin{bmatrix} X & X & X & \ldots & X \\ & X & X & \ldots & X \\ & & X & \ldots & X \\ & & & \ddots & \vdots \\ & & & & X \\ X & X & X & \ldots & X \\ & X & X & \ldots & X \\ & & X & \ldots & X \\ & & & \ddots & \vdots \\ & & & & X \end{bmatrix} \begin{bmatrix} c_{1} \\ c_{2} \\ c_{3} \\ \vdots \\ c_{N+M} \end{bmatrix} = \begin{bmatrix} z_{1}^{i} \\ z_{2}^{i} \\ z_{3}^{i} \\ \vdots \\ z_{N+M}^{i} \\ z_{1}^{j} \\ z_{2}^{j} \\ z_{3}^{j} \\ \vdots \\ z_{N+M}^{j} \end{bmatrix}} & (13)\end{matrix}$

where symbol X denotes a nonzero element, z_(k) ^(i) denotes the k^(th) element of vector z_(i) ¹ and z_(k) ^(j) denotes the k^(th) element of vector z_(j) ¹.

The system of equations (13) is triangularised by applying a series of multiplications of rows by scalars followed by additions of the rows; i.e. it is converted to the form: $\begin{matrix}{\begin{bmatrix} r_{1,1} & r_{1,2} & r_{1,3} & \ldots & r_{1,N+M} \\ 0 & r_{2,2} & r_{2,3} & \ldots & r_{2,N+M} \\ 0 & 0 & r_{3,3} & \ldots & r_{3,N+M} \\ 0 & 0 & 0 & \ddots & \vdots \\ 0 & 0 & 0 & 0 & r_{N+M,N+M} \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} c_{1} \\ c_{2} \\ c_{3} \\ \vdots \\ c_{N+M} \end{bmatrix} = \begin{bmatrix} z_{1} \\ z_{2} \\ z_{3} \\ \vdots \\ z_{N+M} \\ q_{1} \\ q_{2} \\ q_{3} \\ \vdots \\ q_{N+M} \end{bmatrix}} & (14)\end{matrix}$

For this triangularisation, QR factorisation according to the document mentioned above may again be used.

In step 3 the merging error is evaluated. The change of the square prediction error ΔE_(ij) in the segment S_(ij) caused by merging of segments S_(i) and S_(j) is calculated according to $\begin{matrix}{\Delta E_{ij} = \sum\limits_{k=1}^{N+M} q_{k}^{2}} & (15)\end{matrix}$

In this preferred embodiment, QR factorisation of (13) results in Q being orthonormal, such that equation (15) is very simple. However, depending on the properties of Q in this factorisation, ΔE_(ij) is in general a function of q² _(k), k=1, . . . , N+M, if the square prediction error is used as a measure for prediction error; of course, other measures for prediction error are feasible and accordingly, other functional relations between q_(k) and ΔE_(ij) may be adopted.

Finally, in step 4 the segments are merged if the change of square prediction error in formula (15) is acceptable according to a chosen criterion. The segment merging block uses the following strategy for segment merging:

a. a threshold t is selected which corresponds to the allowed increase of square prediction error in the whole frame;

b. ΔE_(ij) is calculated for all pairs of adjacent segments using equation (15);

c. the pair of segments with the smallest ΔE_(ij) is merged;

d. steps b-c are repeated until the sum of ΔE_(ij) corresponding to all merged pairs of segments is greater than t.
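The strategy of steps a to d might be sketched as a greedy loop. The function name `greedy_merge`, the representation of adjacency as a set of two-element frozensets, and the callable `merge` (assumed to return the merged segment data together with ΔE_(ij)) are all hypothetical. The sketch stops before the merge that would push the accumulated error past t, which is only one possible reading of step d.

```python
def greedy_merge(segments, adjacent, merge, t):
    """segments: {segment_id: data}
    adjacent: set of frozenset({id_a, id_b}) for neighbouring segments
    merge(data_a, data_b) -> (merged_data, delta_e)   # hypothetical callable
    Returns (remaining segments, list of merged id pairs, accumulated error)."""
    segments = dict(segments)
    adjacent = set(adjacent)
    total, merged_pairs = 0.0, []
    while adjacent:
        # Step b: evaluate the error increase for every adjacent pair.
        best = None
        for pair in sorted(adjacent, key=sorted):
            a, b = sorted(pair)
            data, delta_e = merge(segments[a], segments[b])
            if best is None or delta_e < best[0]:
                best = (delta_e, a, b, data)
        delta_e, a, b, data = best
        # Step d (as interpreted here): stop before exceeding threshold t.
        if total + delta_e > t:
            break
        # Step c: merge the cheapest pair; the result keeps identifier a.
        total += delta_e
        segments[a] = data
        del segments[b]
        merged_pairs.append((a, b))
        # Neighbours of b become neighbours of the merged segment.
        updated = set()
        for pair in adjacent:
            renamed = frozenset(a if s == b else s for s in pair)
            if len(renamed) == 2:
                updated.add(renamed)
        adjacent = updated
    return segments, merged_pairs, total
```

Re-evaluating every pair in each pass keeps the sketch short; an implementation would more likely re-evaluate only the pairs that involve the newly merged segment.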

For triangularisation of the system in equation (13) a sequence of Givens rotations can be used.

For the resulting new segment S_(ij), a matrix R_(ij) ¹ and vector z_(ij) ¹ are built by taking the first N+M rows of the system in equation (14), i.e. these are given by: $\begin{matrix}{R_{ij}^{1} = \begin{bmatrix} r_{1,1} & r_{1,2} & r_{1,3} & \ldots & r_{1,N+M} \\ & r_{2,2} & r_{2,3} & \ldots & r_{2,N+M} \\ & & r_{3,3} & \ldots & r_{3,N+M} \\ & & & \ddots & \vdots \\ & & & & r_{N+M,N+M} \end{bmatrix}, \quad z_{ij}^{1} = \begin{bmatrix} z_{1} \\ z_{2} \\ z_{3} \\ \vdots \\ z_{N+M} \end{bmatrix}} & (16)\end{matrix}$
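Steps 1 to 3 of the merging operation, together with equation (16), might be sketched as follows. The names `triangularise` and `merge_segments` are hypothetical; Givens rotations are used, and both input systems are assumed to have the same number of coefficients.

```python
import math

def triangularise(rows, n):
    """Rotate augmented rows [R | z] in place until columns 0..n-1 are
    upper triangular. Each rotation is orthonormal."""
    for col in range(n):
        for row in range(col + 1, len(rows)):
            a, b = rows[col][col], rows[row][col]
            if b == 0.0:
                continue  # nothing to eliminate
            r = math.hypot(a, b)
            c, s = a / r, b / r
            for k in range(col, n + 1):  # include the right-hand side
                u, v = rows[col][k], rows[row][k]
                rows[col][k] = c * u + s * v
                rows[row][k] = -s * u + c * v

def merge_segments(R_i, z_i, R_j, z_j):
    """Return (R_ij, z_ij, delta_e) for the combined segment."""
    n = len(z_i)
    # Equation (13): stack the two triangular systems.
    stacked = [list(map(float, r)) + [float(z)] for r, z in zip(R_i, z_i)]
    stacked += [list(map(float, r)) + [float(z)] for r, z in zip(R_j, z_j)]
    # Equation (14): triangularise the stacked system.
    triangularise(stacked, n)
    # Equation (16): the first n rows describe the merged segment.
    R_ij = [row[:n] for row in stacked[:n]]
    z_ij = [row[n] for row in stacked[:n]]
    # Equation (15): the remaining right-hand sides are q_1 .. q_n.
    delta_e = sum(row[n] ** 2 for row in stacked[n:])
    return R_ij, z_ij, delta_e
```

Because the output has the same form as the input, the merged segment can take part in further merging without revisiting its pixels.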

After all pairs of segments in the frame are considered, the output of the segment merging block is obtained. The output comprises three kinds of information. Firstly, it gives a new division of the image with a reduced number of segments. Secondly, for each new segment the block outputs matrix R_(k) ¹ and vector z_(k) ¹. Thirdly, it gives merging information which is sent to the decoder and helps the decoder to identify segments which were merged.

The motion coefficients c_(k)=(c₁,c₂, . . . c_(N+M)) for the segment S_(k) could now be calculated by solving the system of equations R_(k) ¹c_(k)=z_(k) ¹, but their calculation is not yet necessary if the coefficient removal block 33 is used. Also, as will be described further below, at this stage and prior to performing coefficient removal it may be advantageous to orthogonalise the motion field model with respect to the segments obtained from segment merging in orthogonalisation block 32b. This block receives as input the partitioning of the current frame into segments and, for every segment S_(k), matrix R_(k) ¹ and vector z_(k) ¹ shown in (16) from the segment merging block. In this orthogonalisation block the polynomial basis functions f_(i)(·) are replaced by orthogonal polynomials {tilde over (f)}_(i)(x,y). Then the motion vector field of this segment can be represented as: $\begin{matrix}{\Delta\tilde{x}(x,y) = \sum\limits_{i=1}^{N} \tilde{c}_{i} \tilde{f}_{i}(x,y)} & \text{(17a)} \\ {\Delta\tilde{y}(x,y) = \sum\limits_{i=N+1}^{N+M} \tilde{c}_{i} \tilde{f}_{i}(x,y)} & \text{(17b)}\end{matrix}$

Although the motion vector field in equations (4a) and (4b) is fully equivalent to the one in equations (17a) and (17b), the latter is used because coefficients {tilde over (c)}_(i) are less sensitive to quantisation than c_(i) and hence can be represented with fewer bits.

Computation of the orthogonal polynomial basis functions is performed as follows, both in the video encoder and in the video decoder, based on the shape of each segment and on the predefined basis functions f_(i) of the motion model.

In general, well known orthogonalisation algorithms, e.g. the Gram-Schmidt algorithm, can be used to convert ordinary polynomials to polynomials orthogonal in an arbitrarily shaped segment area. However, it is computationally much less complex to orthogonalise the motion field basis functions with respect to the rectangle circumscribing the given segment.

Orthogonalisation with respect to the rectangle circumscribing the given segment can be performed as follows. For a rectangle of N₁×N₂ pixels two sequences of one-dimensional polynomials, e.g. Legendre polynomials, are computed:

1. g_(k)(x), k=0,1, . . . orthogonal on the interval [1,N₁],

2. h_(l)(y), l=0,1, . . . orthogonal on the interval [1,N₂],

The two-dimensional (2-D) orthogonal polynomial basis functions {tilde over (f)}_(i)(x,y), i=1, . . . , N+M are built as a tensor product of 1-D orthogonal polynomials:

{tilde over (f)}_(i)(x,y)=g _(k)(x)h _(l)(y)  (18)

Details on the choice of polynomials can be taken from A. Akansu and R. Haddad, “Multiresolution Signal Decomposition”, Academic Press Inc., USA, 1992, pages 55 to 56.
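The construction of equation (18) might be sketched as follows. As a simplification, this sketch does not use closed-form Legendre polynomials: it applies Gram-Schmidt orthogonalisation to the monomials 1, x, x², . . . over the integer pixel positions 1, . . . , N₁ (and likewise for y), which yields polynomials that are orthonormal on the discrete pixel grid. The names `evaluate`, `orthogonal_polys` and `tensor_basis` are hypothetical, and the requested degree is assumed to be smaller than the number of points.

```python
def evaluate(poly, x):
    """Evaluate a polynomial given as coefficients, lowest power first."""
    return sum(c * x ** k for k, c in enumerate(poly))

def orthogonal_polys(n_points, degree):
    """Gram-Schmidt on 1, x, ..., x^degree over x = 1..n_points.
    Returns polynomials that are orthonormal on that grid."""
    xs = range(1, n_points + 1)
    def dot(p, q):
        return sum(evaluate(p, x) * evaluate(q, x) for x in xs)
    basis = []
    for d in range(degree + 1):
        p = [0.0] * d + [1.0]  # the monomial x^d
        for q in basis:
            w = dot(p, q)
            padded = q + [0.0] * (len(p) - len(q))
            p = [pc - w * qc for pc, qc in zip(p, padded)]
        norm = dot(p, p) ** 0.5
        basis.append([c / norm for c in p])
    return basis

def tensor_basis(g, h, index_pairs):
    """Equation (18): f(x, y) = g_k(x) * h_l(y) for each pair (k, l)."""
    def make(k, l):
        return lambda x, y: evaluate(g[k], x) * evaluate(h[l], y)
    return [make(k, l) for k, l in index_pairs]
```

For example, `tensor_basis(orthogonal_polys(4, 1), orthogonal_polys(3, 1), [(0, 0), (1, 0), (0, 1)])` gives three functions that are mutually orthonormal over a rectangle of 4×3 pixels.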

Orthogonal polynomial basis functions {tilde over (f)}_(i)(·) are chosen so that they can be represented as a linear combination of polynomial basis functions f_(k)(·), k=1,2, . . . , i, i.e., $\begin{matrix}{\tilde{f}_{i}(x,y) = \sum\limits_{k=1}^{i} t_{k,i} f_{k}(x,y), \quad i = 1, 2, \ldots, N+M} & (19)\end{matrix}$

This assumption guarantees that conversion from non-orthogonal to orthogonal basis functions can be implemented with a low computational complexity by simple matrix multiplications.

Matrix R_(k) ¹ and vector z_(k) ¹ describing the motion vector field of the segment need to be recomputed to reflect the change of the basis functions from f_(i)(·) to their orthogonal version {tilde over (f)}_(i)(·). The new matrix {tilde over (R)}_(k) ¹ and vector {tilde over (z)}_(k) ¹ corresponding to orthogonal polynomial basis functions {tilde over (f)}_(i)(·) which satisfy equation (19) can be computed from matrix R_(k) ¹ and vector z_(k) ¹ according to the following formulae:

{tilde over (R)}_(k) ¹=R_(k) ¹T  (20)

{tilde over (z)}_(k) ¹=z_(k) ¹  (21)

Matrix T is given by $\begin{matrix}{T = \begin{bmatrix} t_{1,1} & t_{1,2} & t_{1,3} & \ldots & t_{1,N+M} \\ 0 & t_{2,2} & t_{2,3} & \ldots & t_{2,N+M} \\ 0 & 0 & t_{3,3} & \ldots & t_{3,N+M} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \ldots & t_{N+M,N+M} \end{bmatrix}} & (22)\end{matrix}$

where elements t_(k,i) are taken from equation (19). The new motion vector coefficients {tilde over (c)}_(k) ¹=({tilde over (c)}₁, {tilde over (c)}₂, . . . , {tilde over (c)}_(N+M)) for the segment S_(k) corresponding to orthogonal polynomial basis functions {tilde over (f)}_(i)(·), i=1, . . . , N+M, can be calculated using coefficients c_(k)=(c₁, . . . , c_(N+M)) corresponding to polynomial basis functions f_(i)(·), i=1, . . . , N+M,

{tilde over (c)}_(k) ¹=T⁻¹c_(k)  (23)

or by solving the system of equations:

{tilde over (R)}_(k) ¹{tilde over (c)}_(k) ¹={tilde over (z)}_(k)¹  (24)
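Relations (20), (21) and (23) follow from equation (19) by a change of basis. The following short sketch writes the basis functions as row vectors f=(f₁, . . . , f_(N+M)) and {tilde over (f)}=({tilde over (f)}₁, . . . , {tilde over (f)}_(N+M)) and treats the coefficients as one stacked vector; this compact notation is introduced here for illustration only. Equation (19) then reads

$\tilde{f} = f\,T$

and requiring both models to describe the same motion vector field gives

$\tilde{f}\,\tilde{c}_{k} = f\,T\,\tilde{c}_{k} = f\,c_{k} \quad\Rightarrow\quad c_{k} = T\,\tilde{c}_{k}, \qquad \tilde{c}_{k} = T^{-1} c_{k}$

Substituting c_(k)=T{tilde over (c)}_(k) into the system R_(k) ¹c_(k)=z_(k) ¹ yields

$( R_{k}^{1} T )\,\tilde{c}_{k} = z_{k}^{1}$

so that the matrix of the new system is R_(k) ¹T while the right-hand side is unchanged. Since both R_(k) ¹ and T are upper triangular, their product is upper triangular as well.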

The coefficient removal block 33 receives as its input said new division of the current frame into segments and for every segment S_(k) it receives said matrices {tilde over (R)}_(k) ¹, {tilde over (z)}_(k) ¹, produced previously by the orthogonalisation block. Motion vectors of every segment are represented by N+M motion coefficients, N and M being determined by the motion field model for Δx and Δy.

The motion coefficient removal block 33 determines for a given segment S_(k) whether it is possible to simplify the motion field model without excessively increasing the prediction error. A simplified motion field model is obtained when some basis functions are removed from the model in equations (17a) and (17b) and fewer coefficients are required to describe such a simplified motion field model.

The following procedure is performed by block 33 for all segments in order to find out whether the i^(th) basis function (and i^(th) coefficient) can be removed from the motion field model:

Step 1 includes a matrix modification, where the system of linear equations (24)

{tilde over (R)}_(k) ¹{tilde over (c)}_(k)={tilde over (z)}_(k) ¹

is modified by removing the i^(th) column from {tilde over (R)}_(k) ¹ and the i^(th) element from {tilde over (c)}_(k).

Step 2 includes a matrix triangularisation, preferably using said QR factorisation algorithm described in the above-mentioned document, or using a sequence of Givens rotations. That is, the modified system of equation (24) is triangularised in a manner known as such, by applying a series of multiplications of rows by scalars followed by additions of the rows, i.e. it is converted to the form: $\begin{matrix}{\begin{bmatrix} \tilde{r}_{1,1} & \tilde{r}_{1,2} & \tilde{r}_{1,3} & \ldots & \tilde{r}_{1,N+M-1} \\ 0 & \tilde{r}_{2,2} & \tilde{r}_{2,3} & \ldots & \tilde{r}_{2,N+M-1} \\ 0 & 0 & \tilde{r}_{3,3} & \ldots & \tilde{r}_{3,N+M-1} \\ 0 & 0 & 0 & \ddots & \vdots \\ 0 & 0 & 0 & 0 & \tilde{r}_{N+M-1,N+M-1} \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} \tilde{c}_{1} \\ \tilde{c}_{2} \\ \tilde{c}_{3} \\ \vdots \\ \tilde{c}_{N+M-1} \end{bmatrix} = \begin{bmatrix} \tilde{z}_{1} \\ \tilde{z}_{2} \\ \tilde{z}_{3} \\ \vdots \\ \tilde{z}_{N+M-1} \\ \tilde{q}_{i} \end{bmatrix}} & (25)\end{matrix}$

Step 3 includes error evaluation. The change of the square prediction error for the segment caused by removal of the i^(th) coefficient is simply equal to the term {tilde over (q)}² _(i) in equation (25). Again, this is valid based on Q in said QR factorisation being orthonormal. In general, depending on the properties of Q and the measure for prediction error, the change of the square prediction error is a function of {tilde over (q)}_(i).

Step 4 includes removal of coefficients. If the change of the prediction error is acceptable according to a chosen criterion then the coefficient c_(i) is removed from the coefficient vector. The new number of coefficients is now N+M−1. Matrix {tilde over (R)}_(k) ¹ and vector {tilde over (z)}_(k) ¹ are modified e.g. by means of QR factorisation to form: $\begin{matrix}{\tilde{R}_{k}^{1} = \begin{bmatrix} \tilde{r}_{1,1} & \tilde{r}_{1,2} & \tilde{r}_{1,3} & \ldots & \tilde{r}_{1,N+M-1} \\ & \tilde{r}_{2,2} & \tilde{r}_{2,3} & \ldots & \tilde{r}_{2,N+M-1} \\ & & \tilde{r}_{3,3} & \ldots & \tilde{r}_{3,N+M-1} \\ & & & \ddots & \vdots \\ & & & & \tilde{r}_{N+M-1,N+M-1} \end{bmatrix}, \quad \tilde{z}_{k}^{1} = \begin{bmatrix} \tilde{z}_{1} \\ \tilde{z}_{2} \\ \tilde{z}_{3} \\ \vdots \\ \tilde{z}_{N+M-1} \end{bmatrix}} & (26)\end{matrix}$

The number of coefficients for the segment can be reduced further by repeating the steps 1-4 based on equation (26).
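Steps 1 to 4 for a single candidate coefficient might be sketched as follows. The function name `removal_cost` is hypothetical, Givens rotations are used for the triangularisation, and the index i is zero-based in the code whereas the text counts from 1.

```python
import math

def removal_cost(R, z, i):
    """Remove column i from the system R c = z, retriangularise, and return
    (R_new, z_new, q_squared) as in equations (25) and (26)."""
    n = len(z)
    m = n - 1
    # Step 1: drop column i and append the right-hand side as last column.
    rows = [[float(R[r][c]) for c in range(n) if c != i] + [float(z[r])]
            for r in range(n)]
    # Step 2: triangularise the n x m system with Givens rotations.
    for col in range(m):
        for row in range(col + 1, n):
            a, b = rows[col][col], rows[row][col]
            if b == 0.0:
                continue
            r = math.hypot(a, b)
            c, s = a / r, b / r
            for k in range(col, m + 1):
                u, v = rows[col][k], rows[row][k]
                rows[col][k] = c * u + s * v
                rows[row][k] = -s * u + c * v
    # Step 3: the last right-hand side is q; its square is the error change.
    q = rows[m][m]
    # Step 4: the first m rows form the reduced system of equation (26).
    R_new = [rows[r][:m] for r in range(m)]
    z_new = [rows[r][m] for r in range(m)]
    return R_new, z_new, q * q
```

Calling the function once per remaining coefficient gives the candidate error changes among which the smallest is chosen.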

In the motion coefficient removal block the following strategy for coefficient removal is used:

a. a threshold t is selected which corresponds to an allowed increase of square prediction error in the whole frame;

b. {tilde over (q)}_(i) ² is calculated for all segments and their basis functions using equation (25);

c. a basis function of a segment with smallest {tilde over (q)}_(i) ² is removed;

d. steps b-c are repeated until the sum of all {tilde over (q)}_(i) ² terms corresponding to all removed basis functions in different segments is greater than t.

Finally, step 5 includes coefficient calculation. After removal of suitable coefficients, in this step the final motion coefficients for a segment S_(k) are calculated by solving the system of linear equations (24):

{tilde over (R)}_(k) ¹{tilde over (c)}_(k)={tilde over (z)}_(k) ¹

where matrix {tilde over (R)}_(k) ¹ and vector {tilde over (z)}_(k) ¹ are the result of the previous steps 1-4. The equation can be solved using one of the well known algorithms, e.g. backsubstitution.
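Backsubstitution for the upper triangular system might be sketched as follows. The function name `back_substitute` is hypothetical, and the sketch assumes that all diagonal elements of the matrix are nonzero.

```python
def back_substitute(R, z):
    """Solve R c = z for upper triangular R, starting from the last row."""
    n = len(z)
    c = [0.0] * n
    for i in range(n - 1, -1, -1):
        # Subtract the contribution of the coefficients already solved.
        s = z[i] - sum(R[i][k] * c[k] for k in range(i + 1, n))
        c[i] = s / R[i][i]
    return c
```

For instance, `back_substitute([[2.0, 1.0], [0.0, 1.0]], [3.0, 1.0])` solves 2c₁+c₂=3, c₂=1.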

FIG. 7 shows an embodiment of motion compensated prediction block 1 of FIG. 1. This block receives motion information output by motion field coding block 3 and furthermore receives a reference frame R_(ref)(x,y). Based on this information, block 1 outputs a predicted frame P_(n)(x,y). As shown in FIG. 7, motion compensated prediction block 1 comprises a demultiplexer 11 receiving multiplexed motion information from motion field encoding block 3 and outputting demultiplexed motion information components, i.e. image partitioning information, coefficient selection information and, finally, the value of the transmitted motion coefficients. Reference numeral 12 denotes an image partitioning block receiving said image partitioning information and said reference frame R_(ref) and outputting segments of the frame resulting from partitioning the image according to the image partitioning information. Reference numeral 13 denotes a basis functions building block. This block selects from a predefined set of basis functions the particular basis functions indicated in the selection information generated by the motion coefficient removal block 33 in motion field encoding block 3. Reference numeral 14 denotes a segment prediction block which receives for each segment of said reference frame R_(ref) the associated selection of basis functions and the associated motion coefficients, calculates the motion vectors [{tilde over (Δ)}x, {tilde over (Δ)}y] and, based on these motion vectors, calculates the predicted frame P_(n)(x,y) for each pixel (x,y) of each segment. Motion compensated prediction block 1 corresponds in its structure and function to motion compensated prediction block 21 of the video decoder depicted in FIG. 2. Both motion compensated prediction blocks base the prediction on the motion information output by motion field coding block 3 of the video encoder shown in FIG. 1.
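The operation of segment prediction block 14 might be sketched as follows. The sketch assumes that the predicted pixel is read from the reference frame at the displaced position (x+Δx, y+Δy), and it uses nearest-pixel sampling with clamping at the frame border purely for brevity. The names `predict_segment` and `nearest_sample`, and the representation of basis functions as Python callables, are hypothetical.

```python
def nearest_sample(ref, x, y):
    """Read ref[y][x] at the nearest integer position, clamped to the frame.
    Nearest-pixel sampling is a simplification used in this sketch."""
    h, w = len(ref), len(ref[0])
    xi = min(max(int(round(x)), 0), w - 1)
    yi = min(max(int(round(y)), 0), h - 1)
    return ref[yi][xi]

def predict_segment(ref, pixels, basis_x, coef_x, basis_y, coef_y,
                    sample=nearest_sample):
    """pixels: iterable of (x, y) positions belonging to one segment.
    basis_x / basis_y: selected basis functions f(x, y) for the horizontal
    and vertical displacement; coef_x / coef_y: their coefficients."""
    predicted = {}
    for (x, y) in pixels:
        # Motion vector as a linear combination of the selected basis functions.
        dx = sum(c * f(x, y) for c, f in zip(coef_x, basis_x))
        dy = sum(c * f(x, y) for c, f in zip(coef_y, basis_y))
        # Assumed prediction rule: sample the reference at the displaced position.
        predicted[(x, y)] = sample(ref, x + dx, y + dy)
    return predicted
```

A bilinear sampler could be passed as `sample` to obtain values at non-integer positions.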

FIG. 8 is a block diagram of a motion compensated prediction block according to an embodiment of the present invention. The figure shows the main blocks of a decoder according to the invention, comprising:

means 81 for storing a video data frame;

means 82 for predicting a video data frame based on said stored video data frame and on received motion information;

means 84 for decoding received prediction error data and obtaining a prediction error frame; and

means 85 for calculating and outputting an updated video data frame based on said predicted video data frame and said decoded prediction error frame, and storing the updated video data frame in said storing means;

said means for predicting a video data frame comprising

means 11 for demultiplexing received motion data into at least two of the following: data concerning the partitioning of said updated video data frame into segments S_(i), data concerning a selection of basis functions from a set of motion field model basis functions, and data concerning coefficients of selected basis functions;

means 13 for reconstructing said motion vector field in each segment S_(i) from a linear combination of said selected basis functions and coefficients; and

means 83 for calculating said prediction frame based on said reconstructed motion vector field and based on said stored video data frame.

As a result of all of the steps in all the blocks, the motion field encoder according to the invention produces merging information for informing the decoder which segments are merged, selection information informing the decoder which basis functions are removed and motion coefficient information.

The main advantage of the present invention over prior art solutions is its ability to reduce the amount of motion information by a large factor without causing a large increase of prediction error. Additionally the complexity of the overall system is low, which allows practical implementation on available signal processors or general purpose microprocessors.

The segment merging block has the ability of finding motion vectors of combined segments from given motion vectors estimated for separate segments. It can be proven that the motion vectors it produces are in fact optimal in terms of maintaining low square error for the combined segment. This explains the ability of this block to dramatically reduce the number of segments with only very modest increase of square prediction error.

Use of an orthogonalisation block according to the invention provides a motion field model which is less sensitive to quantisation errors and therefore fewer bits can be used to quantise the motion coefficients.

The motion coefficient removal block is a very powerful tool for instantaneous adaptation of the motion model to the actual amount and type of motion in the video scene. This block can easily test the result of prediction (value of square prediction error for a segment) with a very large number of models, e.g., with all possible combinations of motion field basis functions. A strong advantage of this scheme is that it does not need to repeat the process of motion estimation and hence is computationally simple.

By using motion estimation followed by QR motion analysis the motion field coder can find new motion coefficients for any desired combination of image segments or any desired model of the motion field in the segment by solving very simple systems of linear equations.

According to the present invention the segment merging, orthogonalisation and coefficient removal blocks are preferably combined to provide a greater degree of motion data reduction with a small reduction of image quality.

The system can be implemented in a variety of ways without departing from the spirit and the scope of the invention. For instance, different linear motion models can be used in equation (3). Different methods can be used to linearise the term in the formula (5). Further, different criteria may be used to decide whether to merge or not to merge two segments. The strategy for deciding whether a given basis function should be removed from the model may vary. Triangularisation of matrices in equations (12) and (24) can be performed using various algorithms and calculation of final coefficients by solving equation (24) can be done using a number of known algorithms for solving systems of linear equations. Different interpolation methods may also be used to determine the values of R_(ref)(x,y), G_(x)(x,y) and G_(y)(x,y) at non-integer coordinates.

What is claimed is:
 1. Encoder for performing motion compensated encoding of video data, comprising: motion field estimating means, having an input for receiving a first video data frame I_(n) and a reference frame R_(ref), said motion field estimating means being arranged to estimate a motion vector field describing scene motion displacements of video frame pixels; motion field encoding means having an input to receive from said motion field estimating means said estimated motion vector field; partitioning information indicating partitioning of a video frame into at least two segments said segments being a first segment S_(i) and a second segment S_(j); said motion field encoding means being arranged to obtain compressed motion information comprising first motion coefficients representing said motion vector field; motion compensated prediction means for predicting a predicted video data frame based on said reference frame R_(ref) and said compressed motion information; computing means having an input for receiving said first video data frame and said predicted video data frame, said computing means being arranged to calculate a prediction error frame based on said predicted video data frame and on said first video data frame; prediction error encoding means for encoding said prediction error frame; wherein said motion encoding means further comprises: means for calculating for each segment a distortion matrix E and a distortion vector y such that a predefined measure ΔE for distortion in each segment, due to approximating said motion vector field as coefficients c_(i) of a set of polynomial basis functions f_(i), is a function of (Ec−y), c being a vector of said motion coefficients c_(i); means for decomposing said distortion matrix E into a first matrix Q and a second matrix R such that det Q≠0 and Q R=E, a subset of the set of all columns of matrix Q being a basis of a vector space defined by all possible linear combinations of all column vectors of matrix E, columns of matrix Q being orthogonal to each other; means for calculating an auxiliary vector z according to z=Q⁻¹y, Q⁻¹ being the inverse matrix of said first matrix Q; means for generating for each segment a column extended matrix A comprising the columns of matrix R and vector z as an additional column, and for selecting all rows of matrix A which have elements unequal to zero in all columns due to matrix R; means for merging segments based on selective combination of segments producing an increase in said prediction error within a certain limit; means for generating a row extended matrix B comprising said selected rows of matrix A of said first segment S_(i) and said selected rows of matrix A of said second segment S_(j); means for performing a series of multiplications of rows of matrix B with scalars unequal to zero and additions of rows of matrix B in order to obtain a modified matrix B′ having in the columns due to matrix R as many rows as possible filled with zeros; orthogonalising means receiving one of said matrices A, B and B′ as an input matrix M, said orthogonalising means being arranged to replace said polynomial basis functions f_(i) by orthogonal basis functions {tilde over (f)}_(i) and to calculate second motion coefficients {tilde over (c)} using said orthogonal basis functions and said input matrix M; and quantisation means for quantising said second coefficients {tilde over (c)}.
 2. Encoder for performing motion compensated encoding of video data, comprising: motion field estimating means, having an input for receiving a first video data frame I_(n) and a reference frame R_(ref), said motion field estimating means being arranged to estimate a motion vector field describing motion displacements of video frame pixels; motion field encoding means having an input to receive from said motion field estimating means said estimated motion vector field; partitioning information indicating partitioning of a video frame into at least two segments said segments being a first segment S_(i) and a second segment S_(j); said motion field encoding means being arranged to obtain compressed motion information comprising first motion coefficients representing said motion vector field; motion compensated prediction means for predicting a predicted video data frame based on said reference frame R_(ref) and said compressed motion information; computing means having an input for receiving said first video data frame and said predicted video data frame, said computing means being arranged to calculate a prediction error frame based on said predicted video data frame and on said first video data frame; prediction error encoding means for encoding said prediction error frame; wherein said motion encoding means further comprises: means for calculating for each segment a distortion matrix E and a distortion vector y such that a predefined measure ΔE for distortion in each segment, due to approximating said motion vector field as coefficients c_(i) of a set of polynomial basis functions f_(i), is a function of (E c−y), c being a vector of said motion coefficients c_(i); means for decomposing said distortion matrix E into a first matrix Q and a second matrix R such that det Q≠0 and Q R=E, a subset of the set of all columns of matrix Q being a basis of a vector space defined by all possible linear combinations of all column vectors of matrix E, columns of matrix Q being orthogonal to each other; means for calculating an auxiliary vector z according to z=Q⁻¹y, Q⁻¹ being the inverse matrix of said first matrix Q; means for generating for each segment a column extended matrix A comprising the columns of matrix R and vector z as an additional column, and for selecting all rows of matrix A which have elements unequal to zero in all columns due to matrix R; means for merging segments based on selective combination of segments producing an increase in said prediction error within a certain limit; means for generating a row extended matrix B comprising said selected rows of matrix A of said first segment S_(i) and said selected rows of matrix A of said second segment S_(j); means for performing a series of multiplications of rows of matrix B with scalars unequal to zero and additions of rows of matrix B in order to obtain a modified matrix B′ having in the columns due to matrix R as many rows as possible filled with zeros; orthogonalising means receiving one of said matrices A, B and B′ as an input matrix M, said orthogonalising means being arranged to replace said polynomial basis functions f_(i) by orthogonal basis functions {tilde over (f)}_(i) and to modify said input matrix to a third matrix {tilde over (M)} corresponding to said orthogonal basis functions; removing means having an input for receiving said third matrix {tilde over (M)}, said removing means being arranged to modify said third matrix by removing from said third matrix the i^(th) column due to R corresponding to the i^(th) basis function of said orthogonal basis functions, and said removing means having an output to provide a matrix {circumflex over (M)}; means for computing second motion coefficients {tilde over (c)} using said fourth matrix {circumflex over (M)}; and quantisation means for quantising said second motion coefficients {tilde over (c)}.
 3. Encoder according to claim 1, wherein said orthogonalisation means comprises: means for choosing orthogonal basis functions {tilde over (f)}_(i)(·) as linear combinations of polynomial basis functions f_(k)(·), k=1,2, . . . , i, wherein $\tilde{f}_{i}(x,y) = \sum\limits_{k=1}^{i} t_{k,i} f_{k}(x,y), \quad i = 1, 2, \ldots, N+M$

where k and i are indexes, t values are multipliers and N+M is the number of coefficients representing the motion vector field of a given segment.
 4. Encoder according to claim 2, wherein said orthogonalisation means comprises: means for choosing orthogonal basis functions {tilde over (f)}_(i)(·) as linear combinations of polynomial basis functions f_(k)(·), k=1,2, . . . , i, wherein $\tilde{f}_{i}(x,y) = \sum\limits_{k=1}^{i} t_{k,i} f_{k}(x,y), \quad i = 1, 2, \ldots, N+M$

where k and i are indexes, t values are multipliers and N+M is the number of coefficients representing the motion vector field of a given segment.
 5. Encoder according to claim 3, wherein said motion field encoding means further comprises: means for computing {tilde over (c)} using the relationship R¹T{tilde over (c)}=z¹ wherein $T = \begin{bmatrix} t_{1,1} & t_{1,2} & t_{1,3} & \ldots & t_{1,N+M} \\ 0 & t_{2,2} & t_{2,3} & \ldots & t_{2,N+M} \\ 0 & 0 & t_{3,3} & \ldots & t_{3,N+M} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \ldots & t_{N+M,N+M} \end{bmatrix}$

matrix R¹ contains the columns of said matrix M due to said matrix R; and said vector z¹ contains a column of matrix M due to vector z.

 6. Encoder according to claim 2, wherein said motion field encoding means further comprises: means for computing a matrix {tilde over (R)}¹ and a vector {tilde over (z)}¹ satisfying the following formulae: $\tilde{R}^{1} = R^{1} T, \quad \tilde{z}^{1} = z^{1},$ wherein $T = \begin{bmatrix} t_{1,1} & t_{1,2} & t_{1,3} & \ldots & t_{1,N+M} \\ 0 & t_{2,2} & t_{2,3} & \ldots & t_{2,N+M} \\ 0 & 0 & t_{3,3} & \ldots & t_{3,N+M} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \ldots & t_{N+M,N+M} \end{bmatrix};$

 matrix R¹ contains the columns of said matrix M due to said matrix R; said vector z¹ contains a column of matrix M due to vector z; and means for computing {tilde over (c)} according to the following equation: {circumflex over (R)}¹{tilde over (c)}={circumflex over (z)}¹; wherein matrix {circumflex over (R)}¹ contains the columns of said matrix {circumflex over (M)} due to said matrix R; and vector {circumflex over (z)}¹ contains a column of matrix {circumflex over (M)} due to said vector z.
7. Method of motion compensated encoding of video data, comprising the steps: a) receiving a first video data frame I_(n) and a reference frame R_(ref), estimating a motion vector field describing motion displacements of video frame pixels, and outputting said first video frame, said motion vector field and said reference frame R_(ref); b) receiving said estimated motion vector field; partitioning information indicating partitioning of a video frame into at least two segments, said segments being a first segment S_(i) and a second segment S_(j), for obtaining compressed motion information comprising first motion coefficients representing said motion vector field; c) predicting a predicted video data frame based on said reference frame R_(ref) and said compressed motion information; d) receiving said first video data frame to calculate a prediction error frame based on said predicted video data frame and on said first video data frame; e) encoding said prediction error frame; wherein said method further comprises the steps: g) calculating for each segment a distortion matrix E and a distortion vector y such that a predefined measure ΔE for distortion in each segment, due to approximating said motion vector field as coefficients c_(i) of a set of polynomial basis functions f_(i), is a function of (E c−y), c being a vector of said motion coefficients c_(i); h) decomposing said distortion matrix E into a first matrix Q and a second matrix R such that det Q≠0 and Q R=E, a subset of the set of all columns of matrix Q being a basis of a vector space defined by all possible linear combinations of all column vectors of matrix E, columns of matrix Q being orthogonal to each other; i) calculating an auxiliary vector z according to z=Q⁻¹ y, Q⁻¹ being the inverse matrix of said first matrix Q; j) generating for each segment a column extended matrix A comprising the columns of matrix R and vector z as an additional column, and selecting all rows of matrix A which have elements unequal to zero in all columns due to R; k) merging segments based on selective combination of segments producing an increase in said prediction error within a certain limit; l) generating a row extended matrix B comprising said selected rows of matrix A of said first segment S_(i) and said selected rows of matrix A of said second segment S_(j); m) performing a series of multiplications of rows of matrix B with scalars unequal to zero and additions of rows of matrix B in order to obtain a modified matrix B′ having in the columns due to matrix R as many rows as possible filled with zeros; o) receiving one of said matrices A, B and B′ as an input matrix M in order to replace said polynomial basis functions f_(i) by orthogonal basis functions {tilde over (f)}_(i) and to calculate second motion coefficients {tilde over (c)} using said orthogonal basis functions and said input matrix M; and q) quantising said second motion coefficients {tilde over (c)}.
 8. Method of motion compensated encodingof video data, comprising the steps: a) receiving a first video dataframe I_(n) and a reference frame R_(ref), estimating a motion vectorfield describing motion displacements of video frame pixels, andoutputting said first video frame, said motion vector field and saidreference frame R_(ref); b) receiving said estimated motion vectorfield; partitioning information indicating partitioning of a video frameinto at least two segments, said segments being a first segment S_(i)and a second segment S_(j), for obtaining compressed motion informationcomprising first motion coefficients representing said motion vectorfield; c) predicting a predicted video data frame based on saidreference frame R_(ref) and said compressed motion information; d)receiving said first video data frame to calculate a prediction errorframe based on said predicted video data frame and on said first videodata frame; e) encoding said prediction error frame; wherein said methodfurther comprises the steps: g) calculating for each segment adistortion matrix E and a distortion vector y such that a predefinedmeasure ΔE for distortion in each segment, due to approximating saidmotion vector field as coefficients c_(i) of a set of polynomial basisfunctions f_(i), is a function of (E c−y), c being a vector of saidmotion coefficients c_(i); h) decomposing said distortion matrix E intoa first matrix Q and a second matrix R such that det Q≠0 and Q R=E, asubset of the set of all columns of matrix Q being a basis of a vectorspace defined by all possible linear combinations of all column vectorsof matrix E, columns of matrix Q being orthogonal to each other; i)calculating an auxiliary vector z according to z=Q⁻¹ y, Q⁻¹ being theinverse matrix of said first matrix Q; j) generating for each segment acolumn extended matrix A comprising the columns of matrix R and vector zas an additional column, and for selecting all rows of matrix A whichhave elements unequal to zero in all 
columns due to matrix R; k) mergingsegments based on selective combination of segments producing anincrease in said prediction error within a certain limit; l) generatinga row extended matrix B comprising said selected rows of matrix A ofsaid first segment S_(i) and said selected rows of matrix A of saidsecond segment S_(j); m) performing a series of multiplications of rowsof matrix B with scalars unequal to zero and additions of rows of matrixB in order to obtain a modified matrix B′ having in the columns due tomatrix R as many rows as possible filled with zeros; n) receiving one ofsaid matrices A, B, and B′ as an input matrix M in order to replace saidpolynomial basis functions f_(i) by orthogonal basis functions {tildeover (f)}_(i) and to modify said input matrix to a third matrix {tildeover (M)} corresponding to said orthogonal basis functions; o) receivingsaid third matrix {tilde over (M)} to modify said third matrix {tildeover (M)} to a fourth matrix {circumflex over (M)} by removing from saidthird matrix {tilde over (M)} the i^(th) column due to matrix Rcorresponding to the i^(th) basis function of said orthogonal basisfunctions, and outputting said fourth matrix {circumflex over (M)}; p)computing second motion coefficients {tilde over (c)} using said fourthmatrix {circumflex over (M)}; and q) quantising said second motioncoefficients {tilde over (c)}.
9. Method of motion compensated encoding of video data according to claim 7, wherein said method further comprises the step of choosing orthogonal basis functions {tilde over (f)}_(i)(·) as linear combinations of polynomial basis functions f_(k)(·), k=1,2, . . . , i, wherein
$\tilde{f}_i(x,y) = \sum_{k=1}^{i} t_{k,i}\, f_k(x,y), \quad i = 1, 2, \ldots, N+M$

where k and i are indexes, t values are multipliers and N+M is the number of coefficients representing the motion vector field of a given segment.
10. Method of motion compensated encoding of video data according to claim 8, wherein said method further comprises the step of choosing orthogonal basis functions {tilde over (f)}_(i)(·) as linear combinations of polynomial basis functions f_(k)(·), k=1,2, . . . , i, wherein
$\tilde{f}_i(x,y) = \sum_{k=1}^{i} t_{k,i}\, f_k(x,y), \quad i = 1, 2, \ldots, N+M$

where k and i are indexes, t values are multipliers and N+M is the number of coefficients representing the motion vector field of a given segment.
11. Method of motion compensated encoding of video data according to claim 9, wherein said method further comprises the step of: computing {tilde over (c)} using the relationship R¹T{tilde over (c)}=z¹, wherein
$T = \begin{bmatrix} t_{1,1} & t_{1,2} & t_{1,3} & \cdots & t_{1,N+M} \\ 0 & t_{2,2} & t_{2,3} & \cdots & t_{2,N+M} \\ 0 & 0 & t_{3,3} & \cdots & t_{3,N+M} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & t_{N+M,N+M} \end{bmatrix};$

matrix R¹ contains the columns of said matrix M due to said matrix R; and vector z¹ contains a column of said matrix M due to said vector z.
12. Method of motion compensated encoding of video data according to claim 10, wherein said method further comprises the steps of: computing a matrix {tilde over (R)}¹ and a vector {tilde over (z)}¹ satisfying the following formulae: {tilde over (R)}¹=R¹T, {tilde over (z)}¹=z¹, wherein
$T = \begin{bmatrix} t_{1,1} & t_{1,2} & t_{1,3} & \cdots & t_{1,N+M} \\ 0 & t_{2,2} & t_{2,3} & \cdots & t_{2,N+M} \\ 0 & 0 & t_{3,3} & \cdots & t_{3,N+M} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & t_{N+M,N+M} \end{bmatrix};$

wherein R¹ contains the columns of said matrix M due to said matrix R; vector z¹ contains a column of said matrix M due to said vector z; and computing {tilde over (c)} according to the following equation: {circumflex over (R)}¹{tilde over (c)}={circumflex over (z)}¹; wherein matrix {circumflex over (R)}¹ contains the columns of said matrix {circumflex over (M)} due to said matrix R; and vector {circumflex over (z)}¹ contains a column of matrix {circumflex over (M)} due to said vector z.
13. Decoder for decoding of motion compensation encoded video data, said decoder comprising: means for storing a video data frame; means for predicting a video data frame based on said stored video data frame and on received motion information; means for decoding received prediction error data and obtaining a prediction error frame; and means for calculating and outputting an updated video data frame based on said predicted video data frame and said decoded prediction error frame, and storing the updated video data frame in said storing means; said means for predicting a video data frame comprising: means for demultiplexing received motion data into at least two of the following: data indicating partitioning of said updated video data frame into segments S_(k), data indicating a selection of basis functions from a set of motion field model basis functions, and data indicating coefficients of selected basis functions; means for reconstructing said motion vector field in each segment S_(k) from a linear combination of said selected basis functions and coefficients; and means for calculating said prediction frame based on said reconstructed motion vector field and based on said stored video data frame.
14. Decoder according to claim 13, wherein said means for reconstructing said motion vector field is adapted to receive data regarding the structure of each member of said set of basis functions for each segment S_(k) of said updated video data frame.
15. Decoder for decoding of motion compensation encoded video data, said decoder comprising: means for storing a video data frame; means for predicting a video data frame based on said stored video data frame and on received motion information; means for decoding received prediction error data and obtaining a prediction error frame; and means for calculating and outputting an updated video data frame based on said predicted video data frame and said decoded prediction error frame, and storing the updated video data frame in said storing means; said means for predicting a video data frame comprising: means for demultiplexing received motion data into at least two of the following: data indicating partitioning of said updated video data frame into segments S_(k), data indicating a selection of basis functions from a set of motion field model basis functions, and data indicating coefficients of selected basis functions; means for reconstructing said motion vector field in each segment S_(k) from a linear combination of said selected basis functions and coefficients; and means for calculating said prediction frame based on said reconstructed motion vector field and based on said stored video data frame; wherein said means for reconstructing said motion vector field is adapted to calculate said set of motion field model basis functions for each segment S_(k) as being orthogonal to each other with respect to an area determined by the shape of the segment S_(k).
 16. System for handlingvideo data, comprising an encoder for performing motion compensatedencoding of video data and a decoder for decoding motion compensatedvideo data, said encoder comprising: motion field estimating means,having an input for receiving a first video data frame I_(n) and areference frame R_(ref), said motion field estimating means beingarranged to estimate a motion vector field describing motiondisplacements of video frame pixels; motion field encoding means havingan input to receive from said motion field estimating means saidestimated motion vector field; partitioning information indicatingpartitioning of a video frame into at least two segments said segmentsbeing a first segment S_(i) and a second segment S_(j); said motionfield encoding means being arranged to obtain compressed motioninformation comprising first motion coefficients representing saidmotion vector field; motion compensated prediction means for predictinga predicted video data frame based on said reference frame R_(ref) andsaid compressed motion information; computing means having an input forreceiving said first video data frame and said predicted video dataframe, said computing means being arranged to calculate a predictionerror frame based on said predicted video data frame and on said firstvideo data frame; prediction error encoding means for encoding saidprediction error frame; said motion encoding means further comprising:means for calculating for each segment a distortion matrix E and adistortion vector y such that a predefined measure ΔE for distortion ineach segment, due to approximating said motion vector field ascoefficients c_(i) of a set of polynomial basis functions f_(i), is afunction of (E c−y), c being a vector of said motion coefficients c_(i);means for decomposing said distortion matrix E into a first matrix Q anda second matrix R such that det Q≠0 and Q R=E, a subset of the set ofall columns of matrix Q being a basis of a vector space defined by allpossible 
linear combinations of all column vectors of matrix E, columns of matrix Q being orthogonal to each other; means for calculating an auxiliary vector z according to z=Q⁻¹ y, Q⁻¹ being the inverse matrix of said first matrix Q; means for generating for each segment a column extended matrix A comprising the columns of matrix R and vector z as an additional column, and for selecting all rows of matrix A which have elements unequal to zero in all columns due to R; means for merging segments based on selective combination of segments producing an increase in said prediction error within a certain limit; means for generating a row extended matrix B comprising said selected rows of matrix A of said first segment S_(i) and said selected rows of matrix A of said second segment S_(j); means for performing a series of multiplications of rows of matrix B with scalars unequal to zero and additions of rows of matrix B in order to obtain a modified matrix B′ having in the columns due to matrix R as many rows as possible filled with zeros; orthogonalising means receiving one of said matrices A, B and B′ as an input matrix M, said orthogonalising means being arranged to replace said polynomial basis functions f_(i) by orthogonal basis functions {tilde over (f)}_(i) and to calculate second motion coefficients {tilde over (c)} using said orthogonal basis functions and said input matrix M; and quantisation means for quantizing said second motion coefficients {tilde over (c)}; and said decoder comprising: means for storing a video data frame; means for predicting a video data frame based on said stored video data frame and on received motion information; means for decoding received prediction error data and obtaining a prediction error frame; and means for calculating and outputting an updated video data frame based on said predicted video data frame and said decoded prediction error frame, and storing the updated video data frame in said storing means; said means for predicting a video data frame further comprising: means
for demultiplexing received motion data into at least two of the following: data indicating partitioning of said updated video data frame into segments S_(k), data indicating a selection of basis functions from a set of motion field model basis functions, and data indicating coefficients of selected basis functions; means for reconstructing said motion vector field in each segment S_(k) from a linear combination of said selected basis functions and corresponding coefficients; and means for calculating said prediction frame based on said reconstructed motion vector field and based on said stored video data frame.
 17. System for handling video data, comprisingan encoder for performing motion compensated encoding of video data anda decoder for decoding motion compensated video data, said encodercomprising: motion field estimating means having an input for receivinga first video data frame I_(n) and a reference frame R_(ref), saidmotion field estimating means being arranged to estimate a motion vectorfield describing motion displacements of video frame pixels; motionfield encoding means having an input to receive from said motion fieldestimating means said estimated motion vector field; partitioninginformation indicating partitioning of a video frame into at least twosegments said segments being a first segment S_(i) and a second segmentS_(j); said motion field encoding means being arranged to obtaincompressed motion information comprising first motion coefficientsrepresenting said motion vector field; motion compensated predictionmeans for predicting a predicted video data frame based on saidreference frame R_(ref) and said compressed motion information;computing means having an input for receiving said first video dataframe and said predicted video data frame, said computing means beingarranged to calculate a prediction error frame based on said predictedvideo data frame and on said first video data frame; prediction errorencoding means for encoding said prediction error frame; said motionencoding means further comprising: means for calculating for eachsegment a distortion matrix E and a distortion vector y such that apredefined measure ΔE for distortion in each segment, due toapproximating said motion vector field as coefficients c_(i) of a set ofpolynomial basis functions f_(i), is a function of (E c−y), c being avector of said motion coefficients c_(i); means for decomposing saiddistortion matrix E into a first matrix Q and a second matrix R suchthat det Q≠0, and Q R=E, a subset of the set of all columns of matrix Qbeing a basis of a vector space defined by all possible 
linear combinations of all column vectors of matrix E, columns of matrix Q being orthogonal to each other; means for calculating an auxiliary vector z according to z=Q⁻¹ y, Q⁻¹ being the inverse matrix of said first matrix Q; means for generating for each segment a column extended matrix A comprising the columns of matrix R and vector z as an additional column, and for selecting all rows of matrix A which have elements unequal to zero in all columns due to matrix R; means for merging segments based on selective combination of segments producing an increase in said prediction error within a certain limit; means for generating a row extended matrix B comprising said selected rows of matrix A of said first segment S_(i) and said selected rows of matrix A of said second segment S_(j); means for performing a series of multiplications of rows of matrix B with scalars unequal to zero and additions of rows of matrix B in order to obtain a modified matrix B′ having in the columns due to matrix R as many rows as possible filled with zeros; orthogonalising means receiving one of said matrices A, B and B′ as an input matrix M, said orthogonalising means being arranged to replace said polynomial basis functions f_(i) by orthogonal basis functions {tilde over (f)}_(i) and to modify said input matrix to a third matrix {tilde over (M)} corresponding to said orthogonal basis functions; removing means having an input for receiving said third matrix {tilde over (M)}, said removing means being arranged to modify said third matrix by removing from said third matrix the i^(th) column due to R corresponding to the i^(th) basis function of said orthogonal basis functions, and said removing means having an output to provide a fourth matrix {circumflex over (M)}; means for computing second motion coefficients {tilde over (c)} using said fourth matrix {circumflex over (M)}; and quantisation means for quantizing said second motion coefficients {tilde over (c)}; and said decoder comprising: means for storing a video data
frame; means for predicting a video data frame based on said stored video data frame and on received motion information; means for decoding received prediction error data and obtaining a prediction error frame; and means for calculating and outputting an updated video data frame based on said predicted video data frame and said decoded prediction error frame, and storing the updated video data frame in said storing means; said means for predicting a video data frame comprising: means for demultiplexing received motion data into at least two of the following: data concerning a partitioning of said updated video data frame into segments S_(k), data concerning a selection of basis functions from a set of motion field model basis functions, and data concerning coefficients of selected basis functions; means for reconstructing said motion vector field in each segment S_(k) from a linear combination of said selected basis functions and corresponding coefficients; and means for calculating said prediction frame based on said reconstructed motion vector field and based on said stored video data frame.
18. System for handling video data according to claim 16, further comprising means to transmit video data from said encoder to said decoder.
19. System for handling video data according to claim 16, further comprising means to store video data encoded using said encoder and means to decompress encoded video data by said decoder.
20. System for handling video data according to claim 17, further comprising means to transmit video data from said encoder to said decoder.
21. System for handling video data according to claim 17, further comprising means to store video data encoded by said encoder and means to decompress encoded video data by said decoder.
22. Method of decoding motion compensation encoded video data, said method comprising: storing a video data frame; predicting a video data frame based on said stored video data frame and on received motion information; decoding received prediction error data and obtaining a prediction error frame; and calculating and outputting an updated video data frame based on said predicted video data frame and said decoded prediction error frame, and storing the updated video data frame; said predicting of a video data frame comprising: demultiplexing received motion data into at least two of the following: basis function selection data indicating a selection of basis functions from a set of motion field model basis functions and coefficient values for selected basis functions, and partitioning data indicating the partitioning of said updated video data frame into segments S_(k); reconstructing said motion vector field in each segment S_(k) from a linear combination of said selected basis functions and corresponding coefficient values; and calculating said prediction frame based on said reconstructed motion vector field and based on said stored video data frame.
23. Method of decoding of motion compensation encoded video data according to claim 22, wherein said reconstructing of said motion vector field comprises receiving an indication of the structure of each member of said set of basis functions for each segment S_(k) of said updated video data frame.
24. Method of decoding of motion compensation encoded video data according to claim 22, wherein said reconstructing of said motion vector field comprises calculating said set of motion field model basis functions for each segment S_(k) as being orthogonal to each other with respect to an area determined by the shape of the segment S_(k).
25. Decoder for decoding of motion compensation encoded video data, said decoder comprising: means for storing a video data frame; means for predicting a video data frame based on said stored video data frame and on received motion information; means for decoding received prediction error data and obtaining a prediction error frame; and means for calculating and outputting an updated video data frame based on said predicted video data frame and said decoded prediction error frame, and storing the updated video data frame in said storing means; said means for predicting a video data frame comprising: means for demultiplexing received motion data into at least basis function selection data indicating a selection of basis functions from a set of motion field model basis functions and coefficient values for selected basis functions, and partitioning data indicating the partitioning of said updated video data frame into segments S_(k); means for reconstructing said motion vector field in each segment S_(k) from a linear combination of said selected basis functions and corresponding coefficient values; and means for calculating said prediction frame based on said reconstructed motion vector field and based on said stored video data frame.
26. Decoder according to claim 25, wherein said means for reconstructing said motion vector field is adapted to receive an indication of the structure of each member of said set of basis functions for each segment S_(k) of said updated video data frame.
27. Decoder according to claim 25, wherein said means for reconstructing said motion vector field is adapted to calculate said set of motion field model basis functions for each segment S_(k) as being orthogonal to each other with respect to an area determined by the shape of the segment S_(k).
28. Encoder for performing motion compensated encoding of video data according to claim 1, further comprising: means for transmitting said second motion coefficients and said prediction error frame to a decoder.
29. Encoder for performing motion compensated encoding of video data according to claim 2, further comprising: means for transmitting said second motion coefficients and said prediction error frame to a decoder.
30. Method of motion compensated encoding of video data according to claim 7, further comprising the step: f) transmitting said second motion coefficients and said prediction error frame to a decoder.
31. Method of motion compensated encoding of video data according to claim 8, further comprising the step: f) transmitting said second motion coefficients and said prediction error frame to a decoder.
32. System according to claim 16, further comprising: means for transmitting said second motion coefficients and said prediction error frame to a decoder.
33. System according to claim 17, further comprising: means for transmitting said second motion coefficients and said prediction error frame to a decoder.
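
The orthogonal basis functions recited in the claims are linear combinations of the polynomial basis functions, with an upper triangular matrix T of multipliers t_(k,i), and are orthogonal with respect to the area of a segment. The following Python sketch illustrates one way such multipliers could be obtained. It is not taken from the claims: the use of classical Gram-Schmidt orthonormalisation over the pixel positions of a segment, the function names `orthogonalise` and `eval_ortho`, and the 3×3 example segment with basis 1, x, y are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Pixel = Tuple[float, float]
Basis = Callable[[float, float], float]


def inner(u: Sequence[float], v: Sequence[float]) -> float:
    """Inner product of two functions sampled on the segment's pixels."""
    return sum(a * b for a, b in zip(u, v))


def orthogonalise(basis: Sequence[Basis], segment: Sequence[Pixel]) -> List[List[float]]:
    """Return an upper triangular T (zero-based, T[k][i] = t_{k+1,i+1}) such that
    f~_i = sum_{k<=i} T[k][i] * f_k is orthonormal over the segment's pixels.
    Classical Gram-Schmidt is an assumed choice, not the patent's stated method."""
    n = len(basis)
    # Sample every polynomial basis function on the pixels of the segment.
    samples = [[f(x, y) for (x, y) in segment] for f in basis]
    T = [[0.0] * n for _ in range(n)]
    ortho: List[List[float]] = []  # sampled orthonormal functions found so far
    for i in range(n):
        coeffs = [0.0] * n
        coeffs[i] = 1.0  # start from f_i itself
        vec = list(samples[i])
        for j in range(i):
            # Remove the component of f_i along the j-th orthonormal function.
            proj = inner(samples[i], ortho[j])
            vec = [v - proj * o for v, o in zip(vec, ortho[j])]
            for k in range(j + 1):
                coeffs[k] -= proj * T[k][j]
        norm = inner(vec, vec) ** 0.5
        if norm < 1e-12:
            raise ValueError("basis functions are linearly dependent on this segment")
        ortho.append([v / norm for v in vec])
        for k in range(i + 1):
            T[k][i] = coeffs[k] / norm
    return T


def eval_ortho(T: List[List[float]], basis: Sequence[Basis], i: int, x: float, y: float) -> float:
    """Evaluate f~_i(x, y) as the linear combination of f_1..f_i given by column i of T."""
    return sum(T[k][i] * basis[k](x, y) for k in range(i + 1))


if __name__ == "__main__":
    # Hypothetical segment: a 3x3 block of pixels; hypothetical basis: 1, x, y.
    segment = [(float(x), float(y)) for x in range(3) for y in range(3)]
    basis = [lambda x, y: 1.0, lambda x, y: x, lambda x, y: y]
    T = orthogonalise(basis, segment)
    for row in T:
        print(["%.4f" % t for t in row])
```

Because each new function only subtracts components along earlier ones, the multipliers below the diagonal of T stay zero, which matches the triangular form of T given in the claims. A decoder that knows the segment shape could, under the same assumptions, rebuild the same functions locally and combine them with the received coefficients.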